WrightonLabCSU / DRAM

Distilled and Refined Annotation of Metabolism: A tool for the annotation and curation of function for microbial and viral genomes
GNU General Public License v3.0

Dram-v on Virsorter2 output: pandas error, unexpected EOF, syntax error #363

Open deminatanja opened 2 weeks ago

deminatanja commented 2 weeks ago

Hi,

I am not sure whether this issue has already been reported; if it has, please point me to that thread. I tried to run DRAM-v on VirSorter2 output and got an error. The command I used and the end of my log are below:

DRAM-v.py annotate -i  sea_ice_final-viral-combined-for-dramv.fa \
-v sea_ice_viral-affi-contigs-for-dramv.tab \
-o sea_ice_dramv_out \
--threads $SLURM_CPUS_PER_TASK"
2024-10-10 14:19:49,918 - Retrieved database locations and descriptions
2024-10-10 14:19:49,918 - Annotating sea_ice_final-viral-combined-for-dramv
2024-10-10 14:20:09,555 - Turning genes from prodigal to mmseqs2 db
2024-10-10 14:20:11,419 - Getting hits from kofam
2024-10-10 14:31:26,510 - No KEGG source provided so distillation will be of limited use.
2024-10-10 14:31:26,510 - Getting forward best hits from viral
2024-10-10 14:31:42,967 - Getting reverse best hits from viral
2024-10-10 14:31:52,399 - Getting descriptions of hits from viral
2024-10-10 14:32:03,126 - Getting forward best hits from peptidase
2024-10-10 14:32:18,819 - Getting reverse best hits from peptidase
2024-10-10 14:32:21,284 - Getting descriptions of hits from peptidase
2024-10-10 14:32:24,196 - Getting hits from pfam
2024-10-10 14:33:54,023 - Getting hits from dbCAN
/usr/local/lib/python3.11/site-packages/mag_annotator/database_handler.py:218: UserWarning: No descriptions were found for your id's. Does this GT2_Glycos_transf_2 look like an id from dbcan_description
  warnings.warn(
2024-10-10 14:34:06,175 - Getting hits from VOGDB
2024-10-10 14:52:47,521 - Merging ORF annotations
2024-10-10 14:53:45,531 - No rRNAs were detected, no rrnas.tsv file will be created.
2024-10-10 14:53:56,973 - Annotations complete, processing annotations
/usr/local/lib/python3.11/site-packages/mag_annotator/annotate_vgfs.py:189: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  virsorter_genes['start_position'] = virsorter_genes['start_position'].astype(int)
/usr/local/lib/python3.11/site-packages/mag_annotator/annotate_vgfs.py:190: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  virsorter_genes['end_position'] = virsorter_genes['end_position'].astype(int)
/var/spool/slurmd/job23786494/slurm_script: line 27: unexpected EOF while looking for matching `"'
/var/spool/slurmd/job23786494/slurm_script: line 30: syntax error: unexpected end of file

Could you please help to troubleshoot?

Best, Tatiana
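
A note on the last two log lines: `unexpected EOF while looking for matching` and `syntax error: unexpected end of file` are reported by the shell for `slurm_script`, not by DRAM-v or pandas. The posted command ends with a double quote after `$SLURM_CPUS_PER_TASK` that has no visible opening partner, so an unbalanced quote in the submission script is a plausible cause. The rest of the script was not posted, so this is an assumption. Below is a minimal sketch of how such a line could be checked with Python's standard `shlex` module. The helper name `has_balanced_quotes` is hypothetical, the command is joined onto one line for simplicity, and `shlex` only approximates POSIX quoting rules rather than reproducing bash exactly.

```python
import shlex


def has_balanced_quotes(command: str) -> bool:
    """Return True if shlex can tokenize the command, False on an unclosed quote."""
    try:
        shlex.split(command)
        return True
    except ValueError:
        # shlex raises ValueError ("No closing quotation") for an unmatched quote.
        return False


# The posted command, joined onto one line, including the trailing double quote.
reported = (
    "DRAM-v.py annotate -i sea_ice_final-viral-combined-for-dramv.fa "
    "-v sea_ice_viral-affi-contigs-for-dramv.tab "
    "-o sea_ice_dramv_out "
    '--threads $SLURM_CPUS_PER_TASK"'
)

# Same command with the stray trailing quote removed.
fixed = reported.rstrip('"')

print(has_balanced_quotes(reported))  # False
print(has_balanced_quotes(fixed))     # True
```

If the stray quote is indeed in the script, removing it (or adding the matching opening quote) should clear the two shell errors; the `SettingWithCopyWarning` messages are a separate matter.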

yshengzhi commented 1 week ago

I have the same problem. I modified lines 189-192 of the script annotate_vgfs.py.

As follows:

virsorter_genes_copy = virsorter_genes.copy()
virsorter_genes_copy['start_position'] = virsorter_genes_copy['start_position'].astype(int)
virsorter_genes_copy['end_position'] = virsorter_genes_copy['end_position'].astype(int)
virsorter_genes = virsorter_genes_copy.sort_values('start_position')
virsorter_gene_number = 0

With this change, no error was reported.
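
For anyone wanting to see the copy-based workaround in isolation, here is a minimal, self-contained sketch. The `annotations` table, its values, and the `virsorter_hit` filter column are hypothetical stand-ins and not DRAM's real data structures; only the `start_position` and `end_position` column names come from the log above. It shows that taking an explicit `.copy()` of a filtered DataFrame before assigning to its columns avoids the situation `SettingWithCopyWarning` is about. Note that the messages in the original log are warnings, so this pattern may silence them without affecting the shell errors at the end of the log.

```python
import pandas as pd

# Hypothetical stand-in for an annotations table; positions are stored as strings.
annotations = pd.DataFrame({
    "gene": ["g1", "g2", "g3"],
    "start_position": ["300", "10", "150"],
    "end_position": ["450", "120", "290"],
    "virsorter_hit": [True, False, True],  # hypothetical filter column
})

# Filtering yields a new object derived from `annotations`; taking an explicit
# copy makes it unambiguous that later assignments touch only this new frame.
virsorter_genes = annotations[annotations["virsorter_hit"]].copy()

# Convert the position columns to integers on the independent copy.
virsorter_genes["start_position"] = virsorter_genes["start_position"].astype(int)
virsorter_genes["end_position"] = virsorter_genes["end_position"].astype(int)

# Sort by start position, as in the modified lines above.
virsorter_genes = virsorter_genes.sort_values("start_position")

print(virsorter_genes)
```

The original `annotations` frame keeps its string columns, because all conversions were applied to the copy.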