meglab-metagenomics / amrplusplus_v2

MEGARes and AmrPlusPlus - A comprehensive database of antimicrobial resistance genes and user-friendly pipeline for analysis of high-throughput sequencing data
http://megares.meglab.org/
MIT License

problem running RGI #5

Closed abissett closed 2 years ago

abissett commented 4 years ago

Hi, I'm trying to run amr++ with the RGI option, but am having some problems. When I run the image that doesn't use RGI (meglab-metagenomics-amrplusplus_v2.img with main_AmrPlusPlus_v2.nf or main_AmrPlusPlus_v2_withKraken.nf), amr++ completes. When I run either of the RGI workflows (meglab-metagenomics-amrplusplus_v2-rgi.img with main_AmrPlusPlus_v2_withRGI_Kraken.nf or main_AmrPlusPlus_v2_withRGI.nf), Singularity is unable to create the container and fails:

nextflow run main_AmrPlusPlus_v2_withRGI.nf -profile singularity --card_db data/card/card.json --output AMR++_results_RGI -w work_dir_AMR++_RGI
N E X T F L O W ~ version 20.04.1
Launching main_AmrPlusPlus_v2_withRGI.nf [goofy_tesla] - revision: 77e0805d3d
executor > local (20)
[5e/07ebfb] process > RunQC (63172_S12) [100%] 1 of 1 ✔
[14/19329d] process > QCStats (null) [100%] 1 of 1 ✔
[30/580ab1] process > BuildHostIndex (chr21.fasta) [100%] 1 of 1 ✔
[be/1b47d2] process > AlignReadsToHost (63172_S12) [100%] 1 of 1 ✔
[2e/768fb9] process > RemoveHostDNA (63172_S12) [100%] 1 of 1 ✔
[25/536b8a] process > HostRemovalStats (null) [100%] 1 of 1 ✔
[95/dabae0] process > NonHostReads (63172_S12) [100%] 1 of 1 ✔
[b1/c35023] process > BuildAMRIndex (megares_modified_database_v2.00) [100%] 1 of 1 ✔
[d1/e40461] process > AlignToAMR (63172_S12) [100%] 1 of 1 ✔
[44/691252] process > RunResistome (63172_S12) [100%] 1 of 1 ✔
[81/d99817] process > ResistomeResults (null) [100%] 1 of 1 ✔
[0c/56543b] process > SamDedupRunResistome (63172_S12) [100%] 1 of 1 ✔
[1f/ed5fe0] process > SamDedupResistomeResults (null) [100%] 1 of 1 ✔
[bf/eb3522] process > RunRarefaction (63172_S12) [100%] 1 of 1 ✔
[34/898eee] process > ExtractSNP (63172_S12) [100%] 1 of 1 ✔
[19/7d3fc1] process > RunRGI (63172_S12) [100%] 1 of 1, failed: 1 ✔
[- ] process > SNPconfirmation -
[- ] process > Confirmed_AMR_hits -
[08/2e47c7] process > Confirmed_ResistomeResults (null) [100%] 1 of 1, failed: 1 ✔
[44/cb140b] process > ExtractDedupSNP (63172_S12) [100%] 1 of 1 ✔
[e4/ee6555] process > RunDedupRGI (63172_S12) [100%] 1 of 1, failed: 1 ✔
[- ] process > DedupSNPconfirmation -
[- ] process > ConfirmDedupAMRHits -
[fe/ebad9c] process > DedupSNPConfirmed_ResistomeResults (null) [100%] 1 of 1, failed: 1 ✔

Completed at: 11-Jun-2020 13:39:53
Duration : 2h 9m 1s
CPU hours : 2.8 (0% failed)
Succeeded : 16
Ignored : 4
Failed : 4

FATAL: container creation failed: mount /proc/self/fd/3->/apps/singularity/3.5.0/var/singularity/mnt/session/rootfs error: while mounting image /proc/self/fd/3: kernel reported a bad superblock for squashfs image partition, possible causes are that your kernel doesn't support the compression algorithm or the image is corrupted

I've tried downloading the repo several times, but I'm continuing to get the same error.

I am also able to run an interactive session with meglab-metagenomics-amrplusplus_v2.img, but not with meglab-metagenomics-amrplusplus_v2-rgi.img :

singularity run -i meglab-metagenomics-amrplusplus_v2-rgi.img

FATAL: container creation failed: mount /proc/self/fd/3->/apps/singularity/3.5.0/var/singularity/mnt/session/rootfs error: while mounting image /proc/self/fd/3: kernel reported a bad superblock for squashfs image partition, possible causes are that your kernel doesn't support the compression algorithm or the image is corrupted

Any help appreciated.

Thanks,

Andrew

meglab-metagenomics commented 4 years ago

Hello Andrew,

Thanks for sharing your issue and for using AMR++.

I saw your post last week and was getting the same error you described when I tried downloading the singularity container from the AMR++ singularity hub. I started trying to troubleshoot that issue and wasn't making much progress, but then I tried downloading the singularity container today and it now seems to work again.

Could you please erase the container and try running AMR++ again?

Let me know how it goes and we can keep troubleshooting from there.

Thanks! Enrique and the Microbial Ecology Group

abissett commented 4 years ago

Thanks Enrique!

OK, so pulling the container from the singularity hub gets things further and fixes the error above. When I clone it from the git repo as instructed in the README, the image is corrupted (I'm not sure why).

I now have another error:

N E X T F L O W ~ version 20.04.1
Launching main_AmrPlusPlus_v2_withRGI.nf [disturbed_ritchie] - revision: 77e0805d3d
Read pair files could not be found: ../63193_R{1,2}.fastq.gz

bis068@pearcey-i3:/scratch1/bis068/amrplusplus/amrplusplus_v2> nextflow run main_AmrPlusPlus_v2_withRGI.nf -profile singularity --card_db data/card/card.json --reads --reads "data2/63193_R{1,2}.fastq.gz" --output 63193_2_out -w 63193_work_2 --local --debug
N E X T F L O W ~ version 20.04.1
Launching main_AmrPlusPlus_v2_withRGI.nf [nice_newton] - revision: 77e0805d3d
executor > local (22)
[31/b0754b] process > RunQC (63193_S18) [100%] 1 of 1 ✔
[b0/0017bf] process > QCStats (null) [100%] 1 of 1 ✔
[23/705eda] process > BuildHostIndex (chr21.fasta) [100%] 1 of 1 ✔
[db/ff6d31] process > AlignReadsToHost (63193_S18) [100%] 1 of 1 ✔
[a1/1fc0fd] process > RemoveHostDNA (63193_S18) [100%] 1 of 1 ✔
[e4/d811c0] process > HostRemovalStats (null) [100%] 1 of 1 ✔
[eb/cd5246] process > NonHostReads (63193_S18) [100%] 1 of 1 ✔
[e9/b06969] process > BuildAMRIndex (megares_modified_database_v2.00) [100%] 1 of 1 ✔
[ba/aaff8c] process > AlignToAMR (63193_S18) [100%] 1 of 1 ✔
[e2/00eeb9] process > RunResistome (63193_S18) [100%] 1 of 1 ✔
[2f/0f6ce7] process > ResistomeResults (null) [100%] 1 of 1 ✔
[7c/2134d2] process > SamDedupRunResistome (63193_S18) [100%] 1 of 1 ✔
[0f/83407d] process > SamDedupResistomeResults (null) [100%] 1 of 1 ✔
[df/bf5704] process > RunRarefaction (63193_S18) [100%] 1 of 1 ✔
[c5/4326be] process > ExtractSNP (63193_S18) [100%] 1 of 1 ✔
[27/fbd8d4] process > RunRGI (63193_S18) [100%] 1 of 1 ✔
[c3/f92648] process > SNPconfirmation (63193_S18) [100%] 1 of 1, failed: 1 ✔
[- ] process > Confirmed_AMR_hits -
[2b/49901b] process > Confirmed_ResistomeResults (null) [100%] 1 of 1, failed: 1 ✔
[68/c3dc11] process > ExtractDedupSNP (63193_S18) [100%] 1 of 1 ✔
[ac/541482] process > RunDedupRGI (63193_S18) [100%] 1 of 1 ✔
[2d/c1eeed] process > DedupSNPconfirmation (63193_S18) [100%] 1 of 1, failed: 1 ✔
[- ] process > ConfirmDedupAMRHits -
[e4/11c899] process > DedupSNPConfirmed_ResistomeResults (null) [100%] 1 of 1, failed: 1 ✔
[2d/c1eeed] NOTE: Missing output file(s) 63193_S18_rgi_perfect_hits.csv expected by process DedupSNPconfirmation (63193_S18) -- Error is ignored
[e4/11c899] NOTE: Process DedupSNPConfirmed_ResistomeResults (null) terminated with an error exit status (2) -- Error is ignored
Completed at: 28-Jul-2020 00:35:49
Duration : 11h 6m 1s
CPU hours : 19.8 (0% failed)
Succeeded : 18
Ignored : 4
Failed : 4

The errors I can find in the *.err logs of the RGI steps are:

ac/541482a5d79371000f7a28c8a33f6b> cat .command.err
INFO 2020-07-27 16:19:25,324 : { "card_json": "card.json", "card_annotation": null, "wildcard_annotation": null, "wildcard_index": null, "wildcard_version": null, "baits_annotation": null, "baits_index": null, "kmer_database": null, "amr_kmers": null, "kmer_size": null, "local_database": true, "debug": true }
INFO 2020-07-27 16:19:34,552 : file card.json loaded ok
INFO 2020-07-27 16:19:52,420 : { "card_canonical": { "data_version": "3.0.9" }, "card_variants": { "data_version": "N/A" }, "card_kmers": { "kmer_sizes": [] } }
Traceback (most recent call last):
  File "/usr/local/envs/AmrPlusPlus_env/bin/rgi", line 4, in <module>
    MainBase()
  File "/usr/local/envs/AmrPlusPlus_env/lib/python3.6/site-packages/app/MainBase.py", line 81, in __init__
    getattr(self, args.command)()
  File "/usr/local/envs/AmrPlusPlus_env/lib/python3.6/site-packages/app/MainBase.py", line 86, in main
    self.main_run(args)
  File "/usr/local/envs/AmrPlusPlus_env/lib/python3.6/site-packages/app/MainBase.py", line 120, in main_run
    rgi_obj.run()
  File "/usr/local/envs/AmrPlusPlus_env/lib/python3.6/site-packages/app/RGI.py", line 197, in run
    self.create_databases()
  File "/usr/local/envs/AmrPlusPlus_env/lib/python3.6/site-packages/app/RGI.py", line 191, in create_databases
    db_obj.build_databases()
  File "/usr/local/envs/AmrPlusPlus_env/lib/python3.6/site-packages/app/Database.py", line 25, in build_databases
    self.write_fasta_from_json_rna()
  File "/usr/local/envs/AmrPlusPlus_env/lib/python3.6/site-packages/app/Database.py", line 166, in write_fasta_from_json_rna
    snpList = [j[i]['model_param']['snp']['param_value'][k] for k in j[i]['model_param']['snp']['param_value']]
KeyError: 'snp'

and these:

[2d/c1eeed] NOTE: Missing output file(s) 63193_S18_rgi_perfect_hits.csv expected by process DedupSNPconfirmation (63193_S18) -- Error is ignored
[e4/11c899] NOTE: Process DedupSNPConfirmed_ResistomeResults (null) terminated with an error exit status (2) -- Error is ignored

which I think refer to a missing file that should have been produced by the first step. I can't quite work out which data it's trying to read when it hits the KeyError.

Any help appreciated. Let me know what other info I can provide.

meglab-metagenomics commented 4 years ago

Hello,

Could you try looking at one of the temporary directories where the RGI command failed (like where you found the ".err" files)? Navigate to that directory and look at the ".command.log" file that should give you some insight into what caused the error. You can also look at the ".command.sh" file which has the exact commands that were run if you'd like to try running RGI from that working directory. Make sure that the files being called by the ".command.sh" script are present in that working directory. This could help identify any issues upstream in the pipeline.

If you want to try re-running the command, keep in mind that you'd need to have the RGI tool in your $PATH for the command from .command.sh to work correctly. Since you are using the singularity container, you'd need to run the ".command.run" file which loads the correct environment, like this "bash .command.run".
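As a rough aid for the inspection described above, here is a minimal Python sketch that walks a Nextflow work directory and lists the task directories whose recorded exit status is non-zero. It assumes the usual two-level work layout (for example `ac/541482...`) and the `.exitcode` and `.command.err` files that Nextflow writes into each task directory; the function names `failed_tasks` and `tail` are made up for this illustration and are not part of AMR++.

```python
from pathlib import Path


def failed_tasks(work_dir):
    """Return task directories under work_dir whose .exitcode is non-zero."""
    failed = []
    # Nextflow task directories are two levels deep, e.g. work/ac/541482...
    for exit_file in sorted(Path(work_dir).glob("*/*/.exitcode")):
        if exit_file.read_text().strip() != "0":
            failed.append(exit_file.parent)
    return failed


def tail(path, n=20):
    """Return the last n lines of a text file, or [] if the file is missing."""
    p = Path(path)
    if not p.is_file():
        return []
    return p.read_text(errors="replace").splitlines()[-n:]


def report(work_dir):
    """Print the end of .command.err for every failed task."""
    for task in failed_tasks(work_dir):
        print(f"== {task}")
        for line in tail(task / ".command.err"):
            print(line)
```

Calling `report("63193_work_2")` would print the end of each failed task's error log; from there, `bash .command.run` inside that directory re-runs the step in the container environment, as described above. Tasks that were killed before an exit status was written have no `.exitcode` file and are not listed by this sketch.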

Try that and let me know how it goes!

Best, Enrique and the MEG team

abissett commented 4 years ago

Thanks. It looks like the RGI call is running OK, although I do get the following errors:

Run RGI the first time
Traceback (most recent call last):
  File "/usr/local/envs/AmrPlusPlus_env/bin/rgi", line 4, in <module>
    MainBase()
  File "/usr/local/envs/AmrPlusPlus_env/lib/python3.6/site-packages/app/MainBase.py", line 81, in __init__
    getattr(self, args.command)()
  File "/usr/local/envs/AmrPlusPlus_env/lib/python3.6/site-packages/app/MainBase.py", line 86, in main
    self.main_run(args)
  File "/usr/local/envs/AmrPlusPlus_env/lib/python3.6/site-packages/app/MainBase.py", line 120, in main_run
    rgi_obj.run()
  File "/usr/local/envs/AmrPlusPlus_env/lib/python3.6/site-packages/app/RGI.py", line 197, in run
    self.create_databases()
  File "/usr/local/envs/AmrPlusPlus_env/lib/python3.6/site-packages/app/RGI.py", line 191, in create_databases
    db_obj.build_databases()
  File "/usr/local/envs/AmrPlusPlus_env/lib/python3.6/site-packages/app/Database.py", line 25, in build_databases
    self.write_fasta_from_json_rna()
  File "/usr/local/envs/AmrPlusPlus_env/lib/python3.6/site-packages/app/Database.py", line 166, in write_fasta_from_json_rna
    snpList = [j[i]['model_param']['snp']['param_value'][k] for k in j[i]['model_param']['snp']['param_value']]
KeyError: 'snp'
Run RGI again
Process Process-1:4:
Traceback (most recent call last):
  File "/usr/local/envs/AmrPlusPlus_env/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/local/envs/AmrPlusPlus_env/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/envs/AmrPlusPlus_env/lib/python3.6/site-packages/app/Filter.py", line 121, in process_rrna
    res = rrna_obj.run()
  File "/usr/local/envs/AmrPlusPlus_env/lib/python3.6/site-packages/app/RrnaModel.py", line 113, in run
    query_snps = {"original": hsp.query[d], "change": hsp.sbjct[d], "position": (d - 1)}
IndexError: string index out of range

I'm guessing they are OK?

The next step, however, doesn't run, with the following error: [d1/6de0c4] NOTE: Missing output file(s) 63193_S18_rgi_perfect_hits.csv expected by process SNPconfirmation (63193_S18) -- Error is ignored

When I check out the .command.sh, it's trying to run:

#!/bin/bash -ue

python3 /scratch1/bis068/amrplusplus/amrplusplus_v2/bin/RGI_aro_hits.py 63193_S18_rgi_output.txt 63193_S18

I think the problem is in the creation of the file that is missing above. My filename is called "63193_S18_rgi_output.txt", so when this bit is run:

sample_name = rgifile.split("")
perf_sample_name = sample_name
perf_sample_name.pop()
perf_sample_name.pop()
perf_sample_name.insert(1, '_rgi_perfect_hits.csv')
perf_file_name = ''.join(perf_sample_name)

the expected file name isn't produced; instead it produces "_rgi_perfect_hits.csv" as the file name. There are too many pops. I also think you need to make a copy of "sample_name", otherwise the same list is modified everywhere it is reused and the wrong value is propagated (e.g., perf_sample_name = sample_name.copy()).

As a result, the outputs written by the Python script do not have the file name the next process expects. The missing (expected) file should be called "63193_S18_rgi_perfect_hits.csv" (or possibly "63193_S18_rgi_strict_hits.csv", if the perfect file is skipped when it doesn't exist?), but as written the script cannot produce this name.
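To make the naming problem concrete, here is a small Python sketch (not the actual `RGI_aro_hits.py` code). It first shows why assigning a list to a second name does not copy it, then derives the expected output names by stripping a fixed suffix instead of popping list elements. The helper `hits_file_names` and the assumption that the input always ends in `_rgi_output.txt` are illustrative only.

```python
# Plain assignment creates a second name for the SAME list, not a copy.
sample_name = ["63193_S18", ".txt"]
alias = sample_name
alias.pop()
assert sample_name == ["63193_S18"]      # the original changed too

# .copy() gives an independent list that can be modified safely.
independent = sample_name.copy()
independent.append("_rgi_perfect_hits.csv")
assert sample_name == ["63193_S18"]      # the original is untouched


def hits_file_names(rgi_file, levels=("perfect", "strict")):
    """Build '<sample>_rgi_<level>_hits.csv' names from '<sample>_rgi_output.txt'.

    Hypothetical helper: assumes the RGI output file carries this exact suffix.
    """
    suffix = "_rgi_output.txt"
    if not rgi_file.endswith(suffix):
        raise ValueError(f"unexpected RGI output file name: {rgi_file}")
    sample = rgi_file[: -len(suffix)]
    return {level: f"{sample}_rgi_{level}_hits.csv" for level in levels}


names = hits_file_names("63193_S18_rgi_output.txt")
assert names["perfect"] == "63193_S18_rgi_perfect_hits.csv"
assert names["strict"] == "63193_S18_rgi_strict_hits.csv"
```

Because each name is built from the sample prefix directly, no shared list is mutated and sample names that themselves contain underscores are handled the same way.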

As an aside, in this case there would never be a file called "perfect_hits" because "perf_in_dict = False". What is the expected behaviour in this case?

Again, any help appreciated. I can send you my input fastq files or the working directory from this run if that would be helpful.

Thanks

Andrew

abissett commented 4 years ago

OK. I've gone back to testing with the test data supplied with amr++. You can likely ignore my previous post, as I've added everything to this one in a clearer (I think) way. You should be able to re-create the below using that data and the nextflow output pasted at the bottom. The dataset I was originally using wasn't the best for testing, as it didn't contain any AMR hits that didn't require confirmation, and after RGI it didn't provide any that fit the Perfect model (all were strict). I'm not sure what the expected behaviour is in that case, but "fail" is not that useful. The run may have worked but found no hits under the Perfect model, for example. In that case, perhaps a note/warning that "no hits were found" and a null table as output would be a better indicator that the "fail" is sample-specific and not a workflow issue?
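The behaviour suggested above (a warning plus an empty table instead of a missing file) could look roughly like the following Python sketch. It is not taken from the AMR++ scripts; `write_hits` and the column names in the usage note are hypothetical.

```python
import csv
import sys


def write_hits(path, header, rows):
    """Write a hits table that always exists, even when there are no hits.

    Returns the number of data rows written. With zero rows the file holds
    only the header line and a warning is printed to stderr, so a downstream
    process still finds its expected input file.
    """
    rows = list(rows)
    if not rows:
        print(f"WARNING: no hits were found; writing header-only table to {path}",
              file=sys.stderr)
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)
```

For example, `write_hits("S3_test_rgi_perfect_hits.csv", ["gene", "hits"], [])` would create a one-line file rather than leaving the output missing, so the workflow step would no longer be reported as failed for a sample that simply has no Perfect hits.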

Using the test data as below I run into the same problem I described, essentially a "missing file". After the RunResistome steps I can see that there are a bunch of hits that don't have the "requires confirmation" flag, and some that do.

/RGI_testdata1/ResistomeResults> wc -l AMR_analytic_matrix.csv
105 AMR_analytic_matrix.csv

/RGI_testdata1/ResistomeResults> grep -c 'RequiresSNPConfirmation' AMR_analytic_matrix.csv
10

After RGI there are no "Perfect hits", only "loose"

/RGI_testdata1_work/45/7584378cc9fe777dbf91aa0a8376d7> grep -c 'loose' S3_test_rgi_output.txt
13654

RGI_testdata1_work/45/7584378cc9fe777dbf91aa0a8376d7> wc -l S3_test_rgi_output.txt
13655 S3_test_rgi_output.txt

Incidentally, the file naming of the strict hits output fails to produce the expected name, as detailed in my last post, and "perfect_hits.csv" isn't written because "perf_in_dict" is "False".

/RGI_testdata1_work/45/7584378cc9fe777dbf91aa0a8376d7> ls
_rgi_strict_hits.csv S3_test_rgi_output.txt

I can get around the naming issue by removing the second "pop" and changing the split delimiter thus:

# Get the name of the file to use for the three outputs

sample_name = rgi_file.split("_rgi_output")
perf_sample_name = sample_name
perf_sample_name.pop()
#perf_sample_name.pop()
perf_sample_name.insert(1, '_rgi_perfect_hits.csv')
perf_file_name = ''.join(perf_sample_name)

strict_sample_name = sample_name
strict_sample_name.pop()
strict_sample_name.insert(1, '_rgi_pefect_hits.csv')
strict_file_name = ''.join(strict_sample_name)

I'm assuming this is OK, but given there are no perfect hits, no "perfect_hits.csv" is created.

I'd still expect to see the run complete, given there are 95 lines in the "AMR_analytic_matrix.csv" that don't require snpconfirmation. Or am I missing something here?

When I try to trick the run into creating a 'perfect' output with:

strict_sample_name = sample_name
strict_sample_name.pop()
strict_sample_name.insert(1, '_rgi_perfect_hits.csv')
strict_file_name = ''.join(strict_sample_name)

there is a problem with the RGI_long_combine.py part of the workflow.

The script is looking for a ',' delimiter in the "long_file", but this is a .tsv (".gene.tsv"). When I change the code to the below it seems to work.

      long_reader = csv.reader(long_file, delimiter='\t')

The next part of the process ('Confirmed_ResistomeResults') also seems to have a delimiter error:

[e3/cbb250] NOTE: Process Confirmed_ResistomeResults (null) terminated with an error exit status (1) -- Error is ignored

the code calls for: entry = entry.split('\t')

but the input file is ',' delimited. I wasn't able to get around this one because the same .py is also called by earlier steps, whose input is \t delimited.
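One way around a script that is fed both tab- and comma-delimited files is to pick the delimiter from the file itself. The sketch below is only an illustration of that idea, not the pipeline's code: `guess_delimiter` and `read_table` are hypothetical helpers, and the guess simply counts tabs and commas in the first line, preferring tab on a tie.

```python
import csv


def guess_delimiter(first_line, candidates=("\t", ",")):
    """Return the candidate delimiter that occurs most often in first_line.

    On a tie (including a line with neither character) the first candidate,
    tab, is returned.
    """
    return max(candidates, key=first_line.count)


def read_table(path):
    """Read a delimited text file into a list of rows, guessing the delimiter."""
    with open(path, newline="") as handle:
        first_line = handle.readline()
        handle.seek(0)  # rewind so the first line is parsed as well
        reader = csv.reader(handle, delimiter=guess_delimiter(first_line))
        return list(reader)
```

With this, the same call reads both a `.gene.tsv` file and a comma-delimited matrix. A header that mixes many commas inside tab-separated fields would fool the guess, so an explicit delimiter argument is the safer choice where the format is known.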

Sorry for the long posts, it's tricky for me to work out exactly what's going on in the container and I find myself going around in circles a bit.

Thanks Andrew

/amrplusplus_v2> nextflow -Dcapsule.log=verbose run main_AmrPlusPlus_v2_withRGI.nf --reads "data/raw/*_R{1,2}.fastq.gz" -profile singularity --card_db data/card/card.json --output RGI_testdata1 -w RGI_testdata1_work --local --debug
N E X T F L O W ~ version 20.04.1
Launching main_AmrPlusPlus_v2_withRGI.nf [curious_cajal] - revision: 77e0805d3d
executor > local (50)
[57/f23a75] process > RunQC (S3_test) [100%] 3 of 3 ✔
[6c/4c946e] process > QCStats (null) [100%] 1 of 1 ✔
[56/6310c1] process > BuildHostIndex (chr21.fasta) [100%] 1 of 1 ✔
[be/025738] process > AlignReadsToHost (S3_test) [100%] 3 of 3 ✔
[d6/718b6f] process > RemoveHostDNA (S3_test) [100%] 3 of 3 ✔
[db/324e68] process > HostRemovalStats (null) [100%] 1 of 1 ✔
[d9/62d2f7] process > NonHostReads (S3_test) [100%] 3 of 3 ✔
[71/ebe3cf] process > BuildAMRIndex (megares_modified_database_v2.00) [100%] 1 of 1 ✔
[81/74bac7] process > AlignToAMR (S3_test) [100%] 3 of 3 ✔
[b3/9ac0f8] process > RunResistome (S3_test) [100%] 3 of 3 ✔
[51/421d57] process > ResistomeResults (null) [100%] 1 of 1 ✔
[7f/ebf49d] process > SamDedupRunResistome (S3_test) [100%] 3 of 3 ✔
[ab/8ca86d] process > SamDedupResistomeResults (null) [100%] 1 of 1 ✔
[2c/e8579c] process > RunRarefaction (S3_test) [100%] 3 of 3 ✔
[f4/be4612] process > ExtractSNP (S3_test) [100%] 3 of 3 ✔
[6b/40027f] process > RunRGI (S3_test) [100%] 3 of 3 ✔
[7a/895c54] process > SNPconfirmation (S3_test) [100%] 3 of 3, failed: 3 ✔
[- ] process > Confirmed_AMR_hits -
[2d/08d01d] process > Confirmed_ResistomeResults (null) [100%] 1 of 1, failed: 1 ✔
[2f/ac11f9] process > ExtractDedupSNP (S3_test) [100%] 3 of 3 ✔
[5f/00ffef] process > RunDedupRGI (S3_test) [100%] 3 of 3 ✔
[45/758437] process > DedupSNPconfirmation (S3_test) [100%] 3 of 3, failed: 3 ✔
[- ] process > ConfirmDedupAMRHits -
[b8/cc98e6] process > DedupSNPConfirmed_ResistomeResults (null) [100%] 1 of 1, failed: 1 ✔
[45/758437] NOTE: Missing output file(s) S3_test_rgi_perfect_hits.csv expected by process DedupSNPconfirmation (S3_test) -- Error is ignored
[b8/cc98e6] NOTE: Process DedupSNPConfirmed_ResistomeResults (null) terminated with an error exit status (2) -- Error is ignored
[7a/895c54] NOTE: Missing output file(s) S3_test_rgi_perfect_hits.csv expected by process SNPconfirmation (S3_test) -- Error is ignored
[2d/08d01d] NOTE: Process Confirmed_ResistomeResults (null) terminated with an error exit status (2) -- Error is ignored
Completed at: 08-Sep-2020 09:56:06
Duration : 4m 42s
CPU hours : 0.3 (0.4% failed)
Succeeded : 42
Ignored : 8
Failed : 8

MatteoSchiavinato commented 4 years ago

I'm encountering the same issue (more or less), with RGI. It shows failed: 1 at that step, and if I re-execute the RGI command outside of nextflow:

rgi main --input_sequence /binfl/lv70694/schmat/chicken/amrPlusPlus/process/F21_E_R3/ExtractMegaresSNPs/SNP_fasta/F21_E_R3.snp.fasta --input_type contig --output_file test.output -a diamond --local --debug

I get the following error:

Traceback (most recent call last):
  File "/home/lv70694/schmat/.local/lib/python3.6/site-packages/app/RGI.py", line 286, in process_contig
    if os.stat(contig_fsa_file).st_size > 0:
FileNotFoundError: [Errno 2] No such file or directory: '/binfl/lv70694/schmat/chicken/amrPlusPlus/F21_E_R3.snp.fasta.temp.contig.fsa'
INFO 2020-10-11 19:08:01,689 : run filter
ERROR 2020-10-11 19:08:01,690 : missing blast xml file(). Please check if input_type: 'contig' correspond with input file: '/binfl/lv70694/schmat/chicken/amrPlusPlus/process/F21_E_R3/ExtractMegaresSNPs/SNP_fasta/F21_E_R3.snp.fasta' or use '--low_quality' flag for short contigs to predicts partial genes.

Any idea what could cause it? It seems connected to this issue so I'm writing it here.

MatteoSchiavinato commented 4 years ago

I solved mine. One of the RGI dependencies (prodigal) was not the right version. I know this was "on me" and not on the AMR++ tool, but since it relies on many tools each of which has dependencies to be met, it would be greatly useful if you guys made a little script that one can run upon installation to check if everything is there. Anyway, solved!

If anyone stumbles upon this issue in the future: before opening a new one, check that all these dependencies and Python modules are installed and are the right version: https://github.com/arpcard/rgi/#requirements
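A minimal sketch of the kind of post-install check suggested above, written in Python. It only tests whether each executable can be found on `$PATH`; it does not verify versions, which was the actual cause here, so the versions listed on the RGI requirements page still need to be compared by hand. The tool list is an assumption based on names that appear in this thread (`rgi`, `prodigal`, `diamond`), not a complete list of AMR++ dependencies.

```python
import shutil

# Assumption: illustrative names taken from this thread, not a complete list.
REQUIRED_TOOLS = ("rgi", "prodigal", "diamond")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def check_tools(tools=REQUIRED_TOOLS):
    """Print one line per tool and return True only if all were found."""
    missing = set(missing_tools(tools))
    for tool in tools:
        status = "MISSING" if tool in missing else "found"
        print(f"{tool}: {status}")
    return not missing
```

Running `check_tools()` in the same environment that Nextflow uses would show quickly whether a dependency is absent before a long run is started.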

qducarmon commented 3 years ago

Hi,

I am experiencing the exact same issue as Andrew. Has a solution been found for this in the meantime, or @abissett, did you figure out how to fix this? The run also completes for me, but no hits were found under the Perfect model.

Best and thanks in advance, Quinten

EnriqueDoster commented 2 years ago

Hello to everyone on this older thread. We apologize for not responding earlier and we hope everything worked out okay. Please see the README file, where we announce an upcoming update.