metagenome-atlas / Tutorial

A tutorial for Metagenome-Atlas
GNU General Public License v3.0

Unable to open 'Genecatalog/protein_catalog/db' #6

Closed Sewunet-Abera closed 2 years ago

Sewunet-Abera commented 3 years ago

The previous issue was solved by assigning 500 GB of memory and 50 jobs. It went smoothly until I got this error.

rule get_rep_proteins:
    input: Genecatalog/all_genes/predicted_genes, Genecatalog/clustering/mmseqs
    output: Genecatalog/orf2gene_oldnames.tsv, Genecatalog/protein_catalog, Genecatalog/representatives_of_clusters.fasta
    log: logs/Genecatalog/clustering/get_rep_proteins.log
    jobid: 140
    threads: 50
    resources: tmpdir=/tmp, mem=500, mem_mb=500000, time=5

Activating conda environment: /home/nioo/sewuneta/sorghum_shotgun_metagenome_analysis/atlas.trial/atlas/databases/conda_envs/80a2955775f066104b58bf5b10a3ed68
[Tue Sep 14 07:47:46 2021]
Error in rule get_rep_proteins:
    jobid: 140
    output: Genecatalog/orf2gene_oldnames.tsv, Genecatalog/protein_catalog, Genecatalog/representatives_of_clusters.fasta
    log: logs/Genecatalog/clustering/get_rep_proteins.log (check log file(s) for error message)
    conda-env: /home/nioo/sewuneta/sorghum_shotgun_metagenome_analysis/atlas.trial/atlas/databases/conda_envs/80a2955775f066104b58bf5b10a3ed68
    shell:

        mmseqs createtsv Genecatalog/all_genes/predicted_genes/inputdb Genecatalog/all_genes/predicted_genes/inputdb Genecatalog/clustering/mmseqs/clusterdb Genecatalog/orf2gene_oldnames.tsv  &> logs/Genecatalog/clustering/get_rep_proteins.log

        mkdir Genecatalog/protein_catalog 2>> logs/Genecatalog/clustering/get_rep_proteins.log

        mmseqs result2repseq Genecatalog/all_genes/predicted_genes/inputdb Genecatalog/clustering/mmseqs/clusterdb Genecatalog/protein_catalog/db  &>> logs/Genecatalog/clustering/get_rep_proteins.log

        mmseqs result2flat Genecatalog/all_genes/predicted_genes/inputdb Genecatalog/all_genes/predicted_genes/inputdb Genecatalog/protein_catalog/db Genecatalog/representatives_of_clusters.fasta  &>> logs/Genecatalog/clustering/get_rep_proteins.log

    (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Removing output files of failed job get_rep_proteins since they might be corrupted:
Genecatalog/protein_catalog
Job failed, going on with independent jobs.
Exiting because a job execution failed. Look above for error message
Note the path to the log file for debugging.
Documentation is available at: https://metagenome-atlas.readthedocs.io
Issues can be raised at: https://github.com/metagenome-atlas/atlas/issues
Complete log: /home/nioo/sewuneta/sorghum_shotgun_metagenome_analysis/atlas.trial/atlas/.snakemake/log/2021-09-14T074744.372307.snakemake.log
[2021-09-14 07:47 CRITICAL] Command 'snakemake --snakefile /home/nioo/sewuneta/.conda/envs/atlasenv/lib/python3.8/site-packages/atlas/Snakefile --directory /home/nioo/sewuneta/sorghum_shotgun_metagenome_analysis/atlas.trial/atlas --jobs 80 --rerun-incomplete --configfile '/home/nioo/sewuneta/sorghum_shotgun_metagenome_analysis/atlas.trial/atlas/config.yaml' --nolock --use-conda --conda-prefix /home/nioo/sewuneta/sorghum_shotgun_metagenome_analysis/atlas.trial/atlas/databases/conda_envs --scheduler greedy all --keep-going ' returned non-zero exit status 1.
(atlasenv) sewuneta@nioo0003:~/sorghum_shotgun_metagenome_analysis/atlas.trial/atlas$

And the output in the log file is:

Program call: Genecatalog/all_genes/predicted_genes/inputdb Genecatalog/all_genes/predicted_genes/inputdb Genecatalog/clustering/mmseqs/clusterdb Genecatalog/orf2gene_oldnames.tsv

MMseqs Version: 3.be8f6
first sequence as respresentative false

Query file is Genecatalog/all_genes/predicted_genes/inputdb
Data file is Genecatalog/clustering/mmseqs/clusterdb
Could not open data file Genecatalog/clustering/mmseqs/clusterdb!

Program call: Genecatalog/all_genes/predicted_genes/inputdb Genecatalog/clustering/mmseqs/clusterdb Genecatalog/protein_catalog/db

MMseqs Version: 3.be8f6
Threads 80
Verbosity 3

Could not open data file Genecatalog/clustering/mmseqs/clusterdb!

Program call: Genecatalog/all_genes/predicted_genes/inputdb Genecatalog/all_genes/predicted_genes/inputdb Genecatalog/protein_catalog/db Genecatalog/representatives_of_clusters.fasta

MMseqs Version: 3.be8f6
Use fasta header false
Verbosity 3

Query file is Genecatalog/all_genes/predicted_genes/inputdb
Target file is Genecatalog/all_genes/predicted_genes/inputdb
Data file is Genecatalog/protein_catalog/db
Could not open data file Genecatalog/protein_catalog/db!
logs/Genecatalog/clustering/get_rep_proteins.log (END)

Could you please help me resolve it? (Note that I am working my way into Atlas using the tutorial data and want to use it on my 300 GB of shotgun data.) Thanks.
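The log above contains several "Could not open data file" messages, and later ones may simply follow from the first. A minimal sketch for pulling out the first such message, assuming GNU or BSD grep (for `-m` and `-o`); the helper name `first_open_failure` is hypothetical and not part of atlas or MMseqs:

```shell
#!/usr/bin/env bash
# Print the first "Could not open data file" message found in a log file.
# first_open_failure is a hypothetical helper name, not part of atlas or MMseqs.
first_open_failure() {
    local logfile="$1"
    # -m 1 stops after the first matching line; -o prints only the matched text
    # (everything up to, but not including, the trailing "!").
    grep -m 1 -o 'Could not open data file [^!]*' "$logfile"
}

# Demo on a temporary file holding excerpts of the log quoted in this issue.
demo_log="$(mktemp)"
cat > "$demo_log" <<'EOF'
Data file is Genecatalog/clustering/mmseqs/clusterdb
Could not open data file Genecatalog/clustering/mmseqs/clusterdb!
Could not open data file Genecatalog/protein_catalog/db!
EOF

first_open_failure "$demo_log"
rm -f "$demo_log"
```

On the log in this issue, the first match is the `Genecatalog/clustering/mmseqs/clusterdb` line rather than `Genecatalog/protein_catalog/db`, which suggests also checking whether the preceding clustering step produced its output.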

SilasK commented 2 years ago

Note: I'm updating atlas quite a bit. Do you want to test the dev version?

Your problem has something to do with how you specify the resources. Did you follow the docs on single machine execution? What was the atlas command you ran?

It seems you specified way too many resources in the config file.

Sewunet-Abera commented 2 years ago

Hi Silas, initially it was:

threads: 50
mem: 500

# threads and memory for jobs needing high amount of memory. e.g GTDB-tk,checkm or assembly
large_mem: 500
large_threads: 50
assembly_threads: 50
assembly_memory: 500

But now I have scaled it down to half and hope that's fine. I'm happy to use the dev version if there are newer things I can try.

Thanks
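A rough sketch of how values like the ones above could be compared with what the machine actually offers. It assumes Linux (`nproc` and `/proc/meminfo`), and treats the `mem` values as gigabytes, as suggested by `mem=500, mem_mb=500000` in the job log. The helper name `fits_machine` is hypothetical, and the thread does not confirm that exceeding the machine's resources caused this particular error:

```shell
#!/usr/bin/env bash
# Compare requested threads and memory (GB) with what the machine offers.
# fits_machine is a hypothetical helper; it is not part of atlas.
# Usage: fits_machine REQ_THREADS REQ_MEM_GB [AVAIL_THREADS AVAIL_MEM_GB]
fits_machine() {
    local req_threads="$1" req_mem_gb="$2"
    local avail_threads="${3:-}" avail_mem_gb="${4:-}"

    # If not given explicitly, detect the values (assumes Linux).
    if [ -z "$avail_threads" ]; then
        avail_threads="$(nproc)"
    fi
    if [ -z "$avail_mem_gb" ]; then
        # MemTotal is reported in kB; convert to whole GB.
        avail_mem_gb="$(awk '/^MemTotal:/ {print int($2 / 1048576)}' /proc/meminfo)"
    fi

    if [ "$req_threads" -gt "$avail_threads" ] || [ "$req_mem_gb" -gt "$avail_mem_gb" ]; then
        echo "too high: requested ${req_threads} threads / ${req_mem_gb} GB, machine has ${avail_threads} threads / ${avail_mem_gb} GB"
        return 1
    fi
    echo "ok: requested ${req_threads} threads / ${req_mem_gb} GB fits"
}

# Example with the values from the config above; the result depends on the machine.
fits_machine 50 500 || true
```

The optional third and fourth arguments make it possible to check a config against a different machine than the one the script runs on.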


SilasK commented 2 years ago

Please go back to something closer to the default values. Check the log file, and if there is a memory problem you can increase the memory to 100 GB. Also check that you are executing atlas correctly.

Sewunet-Abera commented 2 years ago

Thanks Kieser. It took me a while, with a few trials back and forth, but now it is running smoothly. I did QC and am now on the assembly step.

Greetings
