NBISweden / aMeta

Ancient microbiome snakemake workflow
MIT License

Full pathogen list pathogensFound.very_inclusive.tab is not provided in the aMeta repo #96

Closed LeandroRitter closed 1 year ago

LeandroRitter commented 1 year ago

We need to add the full pathogensFound.very_inclusive.tab file to aMeta/resources. The file is small and can be downloaded from https://doi.org/10.17044/scilifelab.21185887. Currently there is only a short version of that file in ./test/resources, but users need the whole list of pathogens to run the workflow on real projects.

ZoePochon commented 1 year ago

I noticed that pathogensFound.very_inclusive.tab is not the only file the pipeline needs. It also complains when I don't provide this config entry: pathogenome_seqid2taxid_db: /proj/nobackup/metagenomics/databases/PathoGenome/seqid2taxid.pathogen.map. This file is heavier, though, about 177 MB. Can it also be added to the GitHub repo?

Proof of the failure:
WorkflowError in line 41 of /proj/nobackup/metagenomics/pochonz/aMeta/workflow/rules/common.smk:
Error validating config file.
ValidationError: 'pathogenome_seqid2taxid_db' is a required property

This pathogenome_seqid2taxid_db entry also seems to have been left out of the README explanation of the necessary databases. Or maybe it is wrongly named, because I don't have any "pathogenome_path" in my config file.
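
For anyone hitting the same ValidationError, a quick pre-flight check of the config can help before launching the workflow. The following is only a minimal sketch, not part of aMeta: the function name `check_config` and the list `REQUIRED_PATH_KEYS` are hypothetical, and the only key it knows about is the one named in the error message above. It assumes the config has already been loaded into a plain Python dict.

```python
from pathlib import Path

# Hypothetical list for illustration: only this key is confirmed by the
# error message in this thread; extend it with whatever your config needs.
REQUIRED_PATH_KEYS = ["pathogenome_seqid2taxid_db"]


def check_config(config, required_keys=REQUIRED_PATH_KEYS):
    """Return a list of problems; an empty list means every key is set
    and points to an existing file."""
    problems = []
    for key in required_keys:
        value = config.get(key)
        if value is None:
            problems.append(f"'{key}' is missing from the config")
        elif not Path(value).is_file():
            problems.append(f"'{key}' points to a file that does not exist: {value}")
    return problems


if __name__ == "__main__":
    # Placeholder path; replace with the location of your downloaded file.
    example = {"pathogenome_seqid2taxid_db": "/path/to/seqid2taxid.pathogen.map"}
    for problem in check_config(example):
        print(problem)
```

This does not replace the schema validation that produced the error above. It only reports a missing key or a missing file in a more direct way.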

LeandroRitter commented 1 year ago

Hmm, right @ZoePochon, it is perhaps better to provide detailed documentation and advise users to download the files from PathoGenome, deposited at Figshare: https://doi.org/10.17044/scilifelab.21185887

LeandroRitter commented 1 year ago

The issue is resolved by the new PR.