ncezid-biome / HMAS-QC-Pipeline2

An analysis pipeline for initial quality control and denoising of highly-multiplexed amplicon sequencing (HMAS) data, now without using mothur (as was used in the previous version)
MIT License

Cannot run my dataset using the pipeline #4

Open newuser1971 opened 1 year ago

newuser1971 commented 1 year ago

I installed the pipeline on a CentOS 7 cluster and can run it successfully with test_data. When I tried to run my own dataset, however, the pipeline failed: the workflow terminated early.

[Screenshot of the pipeline run attached]

jinfinance commented 1 year ago

Thank you for trying out our pipeline!

Based on the screenshot provided, it appears that you are running the pipeline with your own dataset, albeit with the test profile setting, which is fine. I noticed that each process is tagged with "3-AdVB3-27_no_human," indicating that you have already updated the 'params.reads' parameter to point to your data path, and there is only one fastq.gz file in that folder. However, it's important to confirm whether you have also created your own primer information file and updated the 'params.primer' parameter accordingly. If not, the pipeline will use the given '451_subset_primer_pairs' for the primer removal step.

Regarding the issue you mentioned, it would be helpful to know if any errors were thrown by Nextflow or if the pipeline terminated abruptly without any warning.

If you are willing to share your '3-AdVB3-27_no_human' reads file and your primer information file, we would be happy to test them on our end. This will assist us in further investigating the problem.

Thank you.

newuser1971 commented 1 year ago

Good morning Jin,

Please see attached fastq, log, config and primer files. If you need more info please let me know.

Thank you very much!

Haibin

primer ACACTGACGACATGGTTCTACATNCTTTGACATNCGNGGNGTNCTNGA ATGTGGAANYMKGCNGTNGACAGAGACCAAGTCTCTGCTACCGTA CS1-1Rm_CS2-3L_3LD
primer ACACTGACGACATGGTTCTACATNCTTTGACATNCGNGGNGTNCTNGA ATGTAYTAYAAYNSNACNGGNAAYATGGGAGACCAAGTCTCTGCTACCGTA CS1-1Rm_CS2-X3RL_new
primer ACACTGACGACATGGTTCTACATATCARCCNGARCCKCAARTDGG ATGTAYTAYAAYNSNACNGGNAAYATGGGAGACCAAGTCTCTGCTACCGTA CS1-X1F_CS2-X3RL_new
primer ACACTGACGACATGGTTCTACAATGGGNTGGGAYTATCCWAARTGTG ATRMAWTTMGGNCCACCNTGWTSAGACCAAGTCTCTGCTACCGTA CS1-F2_CS2-R2As_Bs
primer ACACTGACGACATGGTTCTACACTTATGGGTTGGGATTANCCNAARTGYGA CCNCAYGARTTNTGTTCACAACATACAATGAGACCAAGTCTCTGCTACCGTA CS1-F18184_CS2-R18798
primer ACACTGACGACATGGTTCTACAATGGGNTGGGAYTATCCWAARTGTG GAYGATGGYGTNGTNTGYTATAATAAGACCAAGTCTCTGCTACCGTA CS1-F2_CS2-R3
primer ACACTGACGACATGGTTCTACAACNGGAGACAANACNAAATGGAATGA GGGNATGTTNAANATGCTGTCAACAGTAGACCAAGTCTCTGCTACCGTA CS1-Flu-PAN-F3_CS2-Flu-PAN-R4-2
primer ACACTGACGACATGGTTCTACACCAGTTGGAGGNAATGARAAGAANGC GGNGAYAAYACNAAATGGAATGAATGAGACCAAGTCTCTGCTACCGTA CS1-Flu-PAN-F2-2_CS2-Flu-PAN-R3-3
primer ACACTGACGACATGGTTCTACAGTTGCTTCAATGGTTCARGGNGAYAA GNAAYATHGGNGANCCNGTAACTTCAGCAGACCAAGTCTCTGCTACCGTA CS1-PAR-SUB-F2730_CS2-PAR-SUB-R32932
primer ACACTGACGACATGGTTCTACATCNTTCTTTAGAASNTTYGGNCAYCC TTYGCNAARATGACNTACAAAATGAGAGACCAAGTCTCTGCTACCGTA CS1-RES-MORF1190_CS2-RES-MORR1777
primer ACACTGACGACATGGTTCTACAACACTCTATGTNGGNGAYCCNTTYAAYCC GTNCARGGNGANAATCAAGCAATTGCAGACCAAGTCTCTGCTACCGTA CS1-PAR-RU-F2243_CS2-PAR-RU-R2441
primer ACACTGACGACATGGTTCTACAGTGTAGGTAGNATGTTYGCNATGCARCC GARGGNTGGTGYCAAAANTTGTGGGACAGACCAAGTCTCTGCTACCGTA CS1-PNE-SUB-F1873_CS2-PNE-SUB-R2334
primer ACACTGACGACATGGTTCTACAGCAACGCNGTNTACGGNTTYACNGG TCATMTACGGSGACACKGACTCCCAGACCAAGTCTCTGCTACCGTA CS1-VYGA-F1_CS2-Her-R1
primer ACACTGACGACATGGTTCTACAGTAACTCGGTGTACGGTKTNACNGG AGGTDATHTATGGWGATACGGATAGAGACCAAGTCTCTGCTACCGTA CS1-VYGA-F2_CS2-Her-R2
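
For readers who want to inspect primer information like the above, here is a minimal Python sketch that splits it into records. It assumes the repeating four-field pattern visible in the pasted text (the word "primer", a forward sequence, a reverse sequence, a pair name). The names `PrimerPair` and `parse_primer_text` are hypothetical helpers for illustration and are not part of the pipeline.

```python
from typing import List, NamedTuple


class PrimerPair(NamedTuple):
    name: str
    forward: str
    reverse: str


def parse_primer_text(text: str) -> List[PrimerPair]:
    """Group whitespace-separated tokens into (primer, fwd, rev, name) records."""
    tokens = text.split()
    if len(tokens) % 4 != 0:
        raise ValueError("expected a multiple of 4 fields, got %d" % len(tokens))
    pairs = []
    for i in range(0, len(tokens), 4):
        keyword, forward, reverse, name = tokens[i:i + 4]
        if keyword != "primer":
            raise ValueError("record %d does not start with 'primer'" % (i // 4 + 1))
        pairs.append(PrimerPair(name, forward, reverse))
    return pairs


# Two records copied from the message above; works whether or not they
# are separated by line breaks.
sample = (
    "primer ACACTGACGACATGGTTCTACATNCTTTGACATNCGNGGNGTNCTNGA "
    "ATGTGGAANYMKGCNGTNGACAGAGACCAAGTCTCTGCTACCGTA CS1-1Rm_CS2-3L_3LD "
    "primer ACACTGACGACATGGTTCTACAGTAACTCGGTGTACGGTKTNACNGG "
    "AGGTDATHTATGGWGATACGGATAGAGACCAAGTCTCTGCTACCGTA CS1-VYGA-F2_CS2-Her-R2"
)

pairs = parse_primer_text(sample)
for pair in pairs:
    print(pair.name, len(pair.forward), len(pair.reverse))
```

Because the parser works on tokens rather than lines, it tolerates text that was flattened onto one line when pasted into an email.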

jinfinance commented 1 year ago

Thank you Haibin,

But I only saw the primer information appended to the end of your message; no files were attached. Maybe you can push them to one of your GitHub repos and send me the link.

jinfinance commented 1 year ago

A few more thoughts. If the pipeline did terminate without any warnings, it seems that either the pair_merging process or the quality_filtering process did not generate any output, so all subsequent processes simply stopped (because they had no input).

Can you check the pipeline's output folder? There should be a '3-AdVB3-27_no_human' folder containing a 'temp' sub-folder, and there should be some fastq files in that 'temp' folder.
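
The check described here could be scripted roughly as follows. This is only a sketch: the output directory name `output` and the helper `list_temp_fastq` are assumptions for illustration, and the folder layout is taken from the description in this comment rather than from the pipeline's code.

```python
from pathlib import Path
from typing import List


def list_temp_fastq(output_dir, sample: str) -> List[Path]:
    """Return fastq-like files found in <output_dir>/<sample>/temp."""
    temp_dir = Path(output_dir) / sample / "temp"
    if not temp_dir.is_dir():
        return []  # sample folder or 'temp' sub-folder is missing
    return sorted(
        p for p in temp_dir.iterdir()
        if p.is_file() and ".fastq" in p.name
    )


if __name__ == "__main__":
    # "output" is a hypothetical location; use the pipeline's real output folder.
    found = list_temp_fastq("output", "3-AdVB3-27_no_human")
    if not found:
        print("no fastq files found in the 'temp' sub-folder")
    for path in found:
        print(path, path.stat().st_size, "bytes")
```

An empty result would be consistent with the upstream processes producing nothing, but it does not say which step was responsible.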

One more thing: I actually don't know much about primers, but your primers appear to be very long (48 bp?).
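
To put a number on the primer length, the sketch below measures three of the forward primers copied from the message earlier in this thread and reports the prefix they have in common. Whether that shared prefix has anything to do with the failure is not established in this thread; the code only measures what was posted.

```python
import os

# Three forward primers copied verbatim from the primer information above.
forward_primers = {
    "CS1-1Rm_CS2-3L_3LD": "ACACTGACGACATGGTTCTACATNCTTTGACATNCGNGGNGTNCTNGA",
    "CS1-F2_CS2-R2As_Bs": "ACACTGACGACATGGTTCTACAATGGGNTGGGAYTATCCWAARTGTG",
    "CS1-F18184_CS2-R18798": "ACACTGACGACATGGTTCTACACTTATGGGTTGGGATTANCCNAARTGYGA",
}

# Length of each primer in bases.
lengths = {name: len(seq) for name, seq in forward_primers.items()}

# os.path.commonprefix compares strings character by character,
# so it works on sequences as well as on paths.
shared_prefix = os.path.commonprefix(list(forward_primers.values()))

for name, length in lengths.items():
    print(name, length, "bases")
print("shared prefix:", shared_prefix, "(%d bases)" % len(shared_prefix))
```

For these three sequences the first primer is 48 bases long, and the shared prefix is the 22-base stretch `ACACTGACGACATGGTTCTACA`.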

newuser1971 commented 1 year ago

I checked the work/ directory. In the first step, "cutadapt", there are 28 fastq files, all of size zero, so I think the run already failed at that step.
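
A check like this one can be repeated with a short script. The sketch below walks a work directory and lists fastq files whose size is exactly zero bytes; `find_empty_fastq` is a hypothetical helper, and the directory name `work` is taken from this comment. Note that a gzip-compressed file containing no reads still has a small non-zero size, so this only catches files that are truly empty on disk.

```python
from pathlib import Path
from typing import List


def find_empty_fastq(work_dir) -> List[Path]:
    """Recursively list *.fastq* files under work_dir that are 0 bytes."""
    return sorted(
        p for p in Path(work_dir).rglob("*.fastq*")
        if p.is_file() and p.stat().st_size == 0
    )


if __name__ == "__main__":
    empty = find_empty_fastq("work")
    print(len(empty), "zero-byte fastq files")
    for path in empty:
        print(path)
```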

Yes, the primers are long.

newuser1971 commented 1 year ago

I have attached the files below. Thank you!

3-AdVB3-27_no_human_R1.fastq.gz 3-AdVB3-27_no_human_R2.fastq.gz Juno_Justin_Primer_Pairs.txt nextflow.log nextflow.config.txt

jinfinance commented 1 year ago

Sorry for the late response; I was on the road for a week. Thanks for uploading all the files. I will try them out and let you know what I find.

newuser1971 commented 1 year ago

Hi Rong,

Do you have any update on this issue?

Thank you very much!

Haibin

newuser1971 commented 1 year ago

Do you guys have any update on this issue?