Boyle-Lab / F-Seq2

Improving the feature density based peak caller with dynamic statistics
GNU General Public License v3.0

pandas associated KeyError: 'query_value' ? #2

Closed by jpsmith5 3 years ago

jpsmith5 commented 3 years ago

Hello, I was testing on a handful of samples and received the following KeyError. This error occurs in versions 2.0.0/2.0.1/2.0.2. Current pandas version is 1.2.3. Not sure if this is related to the chromosomes with no peaks or something else. Happy to provide any additional information that would be helpful.

$ fseq2 callpeak g2.bed -f 0 -l 600 -t 4.0 -name g2 -cpus 32 -pe -pe_fragment_size_range auto
no peaks find on chr14_GL000225v1_random
no peaks find on chr14_KI270724v1_random
no peaks find on chr22_KI270736v1_random
no peaks find on chr22_KI270738v1_random
no peaks find on chr9_KI270720v1_random
no peaks find on chrUn_GL000224v1
no peaks find on chrUn_KI270438v1
no peaks find on chrUn_KI270467v1
no peaks find on chrUn_KI270511v1
no peaks find on chrUn_KI270746v1
no peaks find on chrUn_KI270756v1
no peaks find on chrUn_KI270750v1
no peaks find on chr17_GL000205v2_random
no peaks find on chr22_KI270735v1_random
no peaks find on chr17_KI270729v1_random
no peaks find on chrUn_KI270336v1
no peaks find on chrUn_KI270337v1
no peaks find on chrUn_KI270333v1
no peaks find on chrUn_KI270507v1
no peaks find on chrUn_KI270747v1
no peaks find on chrUn_KI270539v1
no peaks find on chr4_GL000008v2_random
no peaks find on chrUn_KI270583v1
no peaks find on chrUn_KI270466v1
no peaks find on chr22_KI270732v1_random
no peaks find on chrUn_GL000216v2
Traceback (most recent call last):
  File "/home/$USER/.local/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3080, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 4554, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 4562, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'query_value'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/$USER/.local/bin/fseq2", line 428, in <module>
    main(temp_dir_name)
  File "/home/$USER/.local/bin/fseq2", line 49, in main
    submain(args)
  File "/home/$USER/.local/lib/python3.8/site-packages/fseq2/callpeak_main.py", line 190, in main
    result_df = fseq2.interpolate_poisson_p_value(result_df)
  File "/home/$USER/.local/lib/python3.8/site-packages/fseq2/fseq2.py", line 864, in interpolate_poisson_p_value
    result_df['query_int'] = result_df['query_value'].astype(int)
  File "/home/$USER/.local/lib/python3.8/site-packages/pandas/core/frame.py", line 3024, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/home/$USER/.local/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3082, in get_loc
    raise KeyError(key) from err
KeyError: 'query_value'

nsamzhao commented 3 years ago

Thanks for reporting the issue. It seems that because no peaks were called, the resulting empty dataframe did not have the 'query_value' column. I will add a better guard for this case in the next version.
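
For readers hitting the same traceback, a guard of the kind described above might look roughly like the following. This is a minimal sketch, not F-Seq2's actual fix: the helper name `add_query_int` is hypothetical, and the only details taken from the traceback are the column names 'query_value' and 'query_int' and the `.astype(int)` conversion. It assumes that an empty dataframe without 'query_value' means "no peaks were called" and should pass through quietly, while a non-empty dataframe missing that column is a genuine error.

```python
import pandas as pd


def add_query_int(result_df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical guard around the conversion that raised the KeyError.

    Not part of F-Seq2; it only illustrates handling an empty result.
    """
    if "query_value" not in result_df.columns:
        if not result_df.empty:
            # Rows exist but the column is missing: a real problem, so fail loudly.
            raise KeyError("query_value")
        # No peaks were called: return an empty frame that still carries the
        # columns later steps expect, instead of raising.
        return result_df.reindex(
            columns=[*result_df.columns, "query_value", "query_int"]
        )

    out = result_df.copy()
    # Same conversion as in the traceback (truncates floats toward zero).
    out["query_int"] = out["query_value"].astype(int)
    return out


if __name__ == "__main__":
    # Empty input no longer raises KeyError.
    print(add_query_int(pd.DataFrame()).columns.tolist())
    # Normal input is converted as before.
    print(add_query_int(pd.DataFrame({"query_value": [1.7, 3.2]})))
```

Returning an empty frame with the expected columns, rather than `None`, lets later steps that select columns or write output keep working without their own special cases. Whether that suits F-Seq2's pipeline is an assumption.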