vrmarcelino / CCMetagen

Microbiome classification pipeline
GNU General Public License v3.0

CCMetagen_merge input suffix discrepancy #25

Closed mihkelvaher closed 3 years ago

mihkelvaher commented 3 years ago

Hi!

README.md states: Where $CCMetagen_out is the folder containing the CCMetagen taxonomic classifications. The results must be in .csv format (default or '--mode text' output of CCMetagen), and these files must end in ".ccm.csv".

If I have a file named kma.res.ccm.csv in the input dir, I get KeyError: "None of [Index(['Superkingdom', 'Kingdom', 'Phylum', 'Class', 'Order', 'Family',\n 'Genus', 'Species'],\n dtype='object')] are in the [columns]", probably because no files were found and the DataFrame was empty.

After some trial and error and a look at the tutorial, I found that the suffix needs to be res.csv; with that suffix the output file is produced.

This line uses the suffix .res.ccm.csv, which is not mentioned in the manual: https://github.com/vrmarcelino/CCMetagen/blob/master/CCMetagen_merge.py#L90
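
A minimal sketch of the suspected failure mode, assuming the merge script collects input files by suffix and then selects the taxonomy columns from the combined table. The helpers `find_inputs` and `select_taxonomy` are hypothetical and are not taken from CCMetagen_merge.py; only the general pandas behaviour is illustrated.

```python
import pandas as pd

# Column names copied from the KeyError message above.
TAX_COLUMNS = ["Superkingdom", "Kingdom", "Phylum", "Class",
               "Order", "Family", "Genus", "Species"]


def find_inputs(file_names, suffix):
    """Hypothetical helper: keep only the names that end in `suffix`."""
    return [name for name in file_names if name.endswith(suffix)]


def select_taxonomy(frames):
    """Hypothetical helper: combine per-sample tables, then pick the taxonomy columns."""
    combined = pd.concat(frames) if frames else pd.DataFrame()
    return combined[TAX_COLUMNS]


# A matcher that expects "res.csv" skips a file named with the other suffix.
matched = find_inputs(["kma.res.ccm.csv"], suffix="res.csv")

# With no matched files the combined table has no columns,
# so selecting the taxonomy columns raises KeyError.
raised_key_error = False
try:
    select_taxonomy([])
except KeyError:
    raised_key_error = True
```

If this is what happens, a check that the matched file list is non-empty, with a message naming the expected suffix, would point to the cause more directly than the KeyError does.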

Best, Mihkel

vrmarcelino commented 3 years ago

Hi Mihkel,

Thanks for your message. Support for the suffix .res.ccm.csv is a relatively recent feature, so if you are using an earlier version of CCMetagen, you just need files ending in "res.csv".

Does updating fix the issue?

Thanks!

mihkelvaher commented 3 years ago

Hi!

I was using the version from pip; the one from git master works as expected 👍 Also, the version numbers reported for the pip release seem slightly inconsistent, but that might get resolved once the pip version is updated:

...
Successfully installed CCMetagen-1.2.4 ete3-3.1.2 pandas-1.2.4 python-dateutil-2.8.1 pytz-2021.1
(myCondaEnv) -bash-4.2$ CCMetagen.py --version
v1.2.3
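
A small sketch of how the mismatch above could be checked programmatically, assuming the installed package metadata is compared against the string printed by the script. The function names are hypothetical; `importlib.metadata` is part of the Python standard library (3.8+).

```python
from importlib import metadata


def normalize(version_text):
    """Strip whitespace and a leading 'v' so 'v1.2.3' compares equal to '1.2.3'."""
    return version_text.strip().lstrip("vV")


def installed_version(dist_name):
    """Return the version recorded in the package metadata, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def versions_agree(metadata_version, reported_version):
    """Compare the packaging version with the version string a script reports."""
    return normalize(metadata_version) == normalize(reported_version)


# Values copied from the terminal output above.
agree = versions_agree("1.2.4", "v1.2.3")
```

Here `agree` is False, which matches the observation that pip reports one version while the script prints another.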

Thanks

vrmarcelino commented 3 years ago

Great! Right, the pip version is slightly behind, sorry about that. Something we need to update!

Thanks for persisting!