GenomicsAotearoa / metagenomics_summer_school

Course materials for the Genomics Aotearoa Metagenomics Summer School, to be hosted at the University of Auckland in September
https://genomicsaotearoa.github.io/metagenomics_summer_school/
GNU General Public License v3.0

`vcontact2_tax_predict.py` not working on vConTACT2 output #59

JSBoey closed 3 months ago

JSBoey commented 3 months ago

The script fails because the `VC Subcluster` column is missing, but is it also odd that it cannot find `Genome`? The vConTACT2 output is here: /nesi/nobackup/nesi02659/MGSS_2024/precomputed_files/7.viruses

Traceback (most recent call last):
  File "./vcontact2_tax_predict.py", line 56, in <module>
    vcontact2_results_sub  = vcontact2_results[['Genome', 'Order_VC_predicted', 'Family_VC_predicted', 'Genus_VC_predicted', 'VC', 'VC Subcluster', 'VC Status']]
  File "/opt/nesi/CS400_centos7_bdw/Python/3.8.2-gimkl-2020a/lib/python3.8/site-packages/pandas-1.0.1-py3.8-linux-x86_64.egg/pandas/core/frame.py", line 2806, in __getitem__
    indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
  File "/opt/nesi/CS400_centos7_bdw/Python/3.8.2-gimkl-2020a/lib/python3.8/site-packages/pandas-1.0.1-py3.8-linux-x86_64.egg/pandas/core/indexing.py", line 1551, in _get_listlike_indexer
    self._validate_read_indexer(
  File "/opt/nesi/CS400_centos7_bdw/Python/3.8.2-gimkl-2020a/lib/python3.8/site-packages/pandas-1.0.1-py3.8-linux-x86_64.egg/pandas/core/indexing.py", line 1645, in _validate_read_indexer
    raise KeyError(f"{not_found} not in index")
KeyError: "['VC Subcluster', 'Genome'] not in index"

mlhoggard commented 3 months ago

This is from the appendix docs, yeah? If you've re-generated the vConTACT2 outputs, which version of vConTACT2 did you use? Later versions changed the output files in a way that broke this script, but I have an updated version I can add here, as long as we're clear in the docs about which vConTACT2 version it targets.
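
To confirm which columns a given output file lacks before running the script, a quick header check might help. This is only a rough sketch: the required column list is copied from line 56 in the traceback above, while the `missing_columns` helper and the sample header are hypothetical and made up for illustration (the header is not real vConTACT2 output). It uses only the standard library so it does not depend on the installed pandas version.

```python
import csv
import io

# Columns the script selects on line 56, copied from the traceback above.
REQUIRED_COLUMNS = [
    "Genome", "Order_VC_predicted", "Family_VC_predicted",
    "Genus_VC_predicted", "VC", "VC Subcluster", "VC Status",
]


def missing_columns(handle, required=REQUIRED_COLUMNS, delimiter=","):
    """Return the required column names that are absent from the header row."""
    # Read only the first row; an empty file yields an empty header.
    header = next(csv.reader(handle, delimiter=delimiter), [])
    present = {name.strip() for name in header}
    # Preserve the order of `required` so the report is easy to compare.
    return [name for name in required if name not in present]


# Hypothetical header for illustration only; not real vConTACT2 output.
sample = io.StringIO(
    "Contig,Order_VC_predicted,Family_VC_predicted,"
    "Genus_VC_predicted,VC,VC Status\n"
)
print(missing_columns(sample))
```

For a real file, the same helper could be called on an open file handle, e.g. `with open(path, newline="") as fh: print(missing_columns(fh))`, where `path` points at whichever table the script reads.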

mlhoggard commented 3 months ago

Updated to `tax_predict_vConTACT2_0.9.19.py` (added to `scripts/`)