BenMinch / PIGv

Pipeline for the identification of giant virus genomes from metagenomic datasets
MIT License

Screening bins error #2

Open XiaomengWang0413 opened 3 months ago

XiaomengWang0413 commented 3 months ago

Hello, thanks for this tool! I encountered the following problems while running it. What causes them, and how can I solve them?

Screening bins
cp: cannot stat 'EQ-6-2000m-vDNA/04_Viral_recall/viral_recall_out/batch_summary.txt/*.summary.tsv': Not a directory
scripts/viralrecall_post_screen.py:25: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  data['Virus'][i]= 'no'
scripts/viralrecall_post_screen.py:26: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  data['Sum_score'][i]= data2['score'].sum()
scripts/viralrecall_post_screen.py:28: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  data['Contamination'][i]= len(data2[data2['score'] < 0])
scripts/viralrecall_post_screen.py:29: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  data['Total_contigs'][i]= len(data2)
scripts/viralrecall_post_screen.py:30: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  data['Percent_negative'][i]=data['Contamination'][i]/data['Total_contigs'][i]
scripts/viralrecall_post_screen.py:36: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  data['PolB'][i]= 'yes'
Traceback (most recent call last):
  File "/home/xiaomeng/miniconda3/envs/PIGV/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'GVOGm0003'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "scripts/viralrecall_post_screen.py", line 39, in <module>
    if data['GVOGm0003'][i] > 0:
  File "/home/xiaomeng/miniconda3/envs/PIGV/lib/python3.7/site-packages/pandas/core/frame.py", line 3458, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/home/xiaomeng/miniconda3/envs/PIGV/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
    raise KeyError(key) from err
KeyError: 'GVOGm0003'
mv: cannot stat 'viralrecall_batch_summary.tsv': No such file or directory
Traceback (most recent call last):
  File "PIGv.py", line 125, in <module>
    viral_screen=pd.read_csv(output+'/05_Viral_screening/viralrecall_batch_summary.tsv', sep='\t')
  File "/home/xiaomeng/miniconda3/envs/PIGV/lib/python3.7/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "/home/xiaomeng/miniconda3/envs/PIGV/lib/python3.7/site-packages/pandas/io/parsers/readers.py", line 586, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/home/xiaomeng/miniconda3/envs/PIGV/lib/python3.7/site-packages/pandas/io/parsers/readers.py", line 482, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/home/xiaomeng/miniconda3/envs/PIGV/lib/python3.7/site-packages/pandas/io/parsers/readers.py", line 811, in __init__
    self._engine = self._make_engine(self.engine)
  File "/home/xiaomeng/miniconda3/envs/PIGV/lib/python3.7/site-packages/pandas/io/parsers/readers.py", line 1040, in _make_engine
    return mapping[engine](self.f, **self.options)  # type: ignore[call-arg]
  File "/home/xiaomeng/miniconda3/envs/PIGV/lib/python3.7/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 51, in __init__
    self._open_handles(src, kwds)
  File "/home/xiaomeng/miniconda3/envs/PIGV/lib/python3.7/site-packages/pandas/io/parsers/base_parser.py", line 229, in _open_handles
    errors=kwds.get("encoding_errors", "strict"),
  File "/home/xiaomeng/miniconda3/envs/PIGV/lib/python3.7/site-packages/pandas/io/common.py", line 707, in get_handle
    newline="",
FileNotFoundError: [Errno 2] No such file or directory: 'EQ-6-2000m-vDNA/05_Viral_screening/viralrecall_batch_summary.tsv'

BenMinch commented 3 months ago

This error occurs when none of the viral bins pass screening. There could still be partial bins if you see them in the ncldv_good_bins folder.
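For readers hitting the same `KeyError: 'GVOGm0003'`: the traceback shows the script indexing a marker column that is absent from the summary table. The sketch below is not the PIGv code; it only illustrates one way such a lookup could tolerate a missing column and avoid the `SettingWithCopyWarning` by assigning through `.loc`. The helper `flag_marker` and the column names `marker_x`, `flag_x`, and `flag_GVOGm0003` are hypothetical.

```python
import pandas as pd


def flag_marker(data: pd.DataFrame, marker: str, flag_col: str) -> pd.DataFrame:
    """Return a copy of `data` with `flag_col` set to 'yes' where `marker` has hits.

    Hypothetical helper: a marker column that is missing from the table is
    treated as zero hits for every row instead of raising KeyError.
    """
    out = data.copy()
    if marker in out.columns:
        hits = out[marker].fillna(0)
    else:
        hits = pd.Series(0, index=out.index)
    out[flag_col] = "no"
    # A single .loc assignment avoids chained indexing such as data[col][i] = ...
    out.loc[hits > 0, flag_col] = "yes"
    return out


# Toy table: 'marker_x' is present, 'GVOGm0003' is absent (as in the traceback).
demo = pd.DataFrame({"bin": ["bin1", "bin2"], "marker_x": [2, 0]})
demo = flag_marker(demo, "marker_x", "flag_x")
demo = flag_marker(demo, "GVOGm0003", "flag_GVOGm0003")
print(demo)
```

With a guard like this, an empty or marker-free summary would produce rows flagged `no` rather than stopping the run, which matches the "nothing passed screening" situation described above.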

XiaomengWang0413 commented 3 months ago

> This error occurs when none of the viral bins pass screening. There could still be partial bins if you see them in the ncldv_good_bins folder.

Thank you for your timely reply. There are 10 genomes larger than 20 kbp in my ncldv_good_bins folder. Did these genomes fail the bin screening? Best

XiaomengWang0413 commented 3 months ago

Maybe the result contains an error: cp: cannot stat 'EQ-6-2000m-vDNA/04_Viral_recall/viral_recall_out/batch_summary.txt/*.summary.tsv': Not a directory
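A note on that message: `Not a directory` from `cp` means a component of the path (here `batch_summary.txt`) exists but is a regular file, so nothing can be matched underneath it. The sketch below is an assumption about how a caller could handle both layouts and is not taken from PIGv; `find_summaries` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def find_summaries(path):
    """Return summary files for `path`, whether it is a directory or a single file.

    Hypothetical helper: a directory is searched for '*.summary.tsv',
    a regular file is returned as-is, and a missing path yields an empty list.
    """
    p = Path(path)
    if p.is_dir():
        return sorted(p.glob("*.summary.tsv"))
    if p.is_file():
        return [p]
    return []


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    as_file = Path(tmp) / "batch_summary.txt"
    as_file.write_text("bin\tscore\n")
    as_dir = Path(tmp) / "per_bin"
    as_dir.mkdir()
    (as_dir / "bin1.summary.tsv").write_text("contig\tscore\n")

    from_file = [p.name for p in find_summaries(as_file)]
    from_dir = [p.name for p in find_summaries(as_dir)]
    from_missing = find_summaries(Path(tmp) / "absent")

print(from_file, from_dir, from_missing)
```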

BenMinch commented 3 months ago

The screening looks not only for length, but also for key marker genes as well as a positive viralrecall score (hits to known giant virus protein families). A negative score or lack of enough marker genes will cause it to not pass screening.
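As a rough illustration of the three criteria just described (length, marker genes, positive viralrecall score), here is a minimal sketch. It is not the PIGv implementation: the `BinSummary` fields, the `passes_screening` function, and both default thresholds are placeholders chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class BinSummary:
    """Hypothetical per-bin record; field names are illustrative only."""
    name: str
    length_bp: int
    marker_hits: int
    viralrecall_score: float


def passes_screening(b: BinSummary, min_length_bp: int = 100_000, min_markers: int = 1) -> bool:
    """Apply all three checks; the default thresholds are placeholders, not PIGv's values."""
    return (
        b.length_bp >= min_length_bp
        and b.marker_hits >= min_markers
        and b.viralrecall_score > 0
    )


bins = [
    BinSummary("long_with_markers", 250_000, 3, 12.5),
    BinSummary("long_negative_score", 250_000, 3, -4.0),
    BinSummary("long_no_markers", 250_000, 0, 8.0),
]
kept = [b.name for b in bins if passes_screening(b)]
print(kept)
```

The point of the example is that a bin can be long enough and still be rejected by either of the other two checks.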

XiaomengWang0413 commented 3 months ago

> The screening looks not only for length, but also for key marker genes as well as a positive viralrecall score (hits to known giant virus protein families). A negative score or lack of enough marker genes will cause it to not pass screening.

OK, Thank you for your timely reply.