heathsc / gemBS

gemBS is a bioinformatics pipeline designed for high throughput analysis of DNA methylation from Whole Genome Bisulfite Sequencing data (WGBS).
GNU General Public License v3.0

error running gemBS call #58

Open lrsantiago opened 5 years ago

lrsantiago commented 5 years ago

Hello Dr. Heath, I ran gemBS map and call on one sample (WGBS) and everything went well. I then added two more samples to my metadata.csv, deleted the .gemBS folder, and reran gemBS prepare and index. Next I ran gemBS map on the two new samples with the same .config file. gemBS map went well, but when I ran gemBS call, I got the following error:

----------- Methylation Calling --------
: Reference : /store/LevelOne/TAPG_WGBS/GemBS_analysis/reference/GCF_000001405.39_GRCh38.p13_genomic.fna
: Species : Homo sapiens
: Right Trim : 0
: Left Trim : 5
: Chromosomes : ['NC_000001.11', 'NC_000002.12', 'NC_000003.12', 'NC_000004.12', 'NC_000005.10', 'NC_000006.12', 'NC_000007.14', 'NC_000023.11', 'NC_000008.11', 'NC_000009.12', 'NC_000011.10', 'NC_000010.11', 'NC_000012.12', 'NC_000013.11', 'NC_000014.9', 'NC_000015.10', 'NC_000016.10', 'NC_000017.11', 'NC_000018.10', 'NC_000020.11', 'NC_000019.10', 'NC_000024.10', 'NC_000022.11', 'NC_000021.9', '@pool_1', '@pool_2', '@pool_3', '@pool_4', '@pool_5', '@pool_6', '@pool_7', '@pool_8']
: Threads : 15
: dbSNP File : /store/LevelOne/TAPG_WGBS/GemBS_analysis/dbSNP/dbSNP_gemBS.idx
: Sample: S2 Bam: /store/LevelOne/TAPG_WGBS/mapping/S2/S2.bam
: Sample: S17 Bam: /store/LevelOne/TAPG_WGBS/mapping/S17/S17.bam
:
: Methylation Calling...
2019-08-19 13:51:55,723 ERROR: Process '/usr/lib/python3.6/site-packages/gemBS/bin/bcftools' finished with 255
2019-08-19 13:51:55,749 ERROR: The chromosome block NT_187502.1 is not sorted, consider running with -a.
Exception in thread Thread-1:
Traceback (most recent call last):
ValueError: Error while concatenating bcf calls.

I don't know what could be causing this since it worked perfectly in my previous analysis with one sample from the same database.
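
One possible first check for the "not sorted" error above is whether the BAM of a working sample and the BAM of a failing sample list their contigs in the same order. This has not been confirmed in this thread as the cause; it is only a minimal diagnostic sketch. The function names `sq_names` and `first_order_mismatch` are hypothetical helpers, and the input is assumed to be SAM header text such as `samtools view -H` prints.

```python
def sq_names(header_text):
    """Return the contig names from the @SQ lines of a SAM header, in order."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # @SQ fields are tab-separated TAG:VALUE pairs; SN holds the contig name.
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
                break
    return names


def first_order_mismatch(names_a, names_b):
    """Return (index, name_a, name_b) at the first difference, or None if identical."""
    for i, (a, b) in enumerate(zip(names_a, names_b)):
        if a != b:
            return (i, a, b)
    if len(names_a) != len(names_b):
        # One list is a strict prefix of the other: report the first extra entry.
        i = min(len(names_a), len(names_b))
        a = names_a[i] if i < len(names_a) else None
        b = names_b[i] if i < len(names_b) else None
        return (i, a, b)
    return None
```

To use it, save the output of `samtools view -H S2.bam` and of the same command for the sample that worked, read both files as text, and pass the two name lists to `first_order_mismatch`. A result of `None` means the headers agree and the problem lies elsewhere.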

YingYa commented 5 years ago

Hello lrsantiago,

How long does gemBS call take to run on one sample (WGBS) on GRCh38? Mine has been running for more than 100 hours without any output.

heathsc commented 5 years ago

Hi,

It shouldn’t take this long. At what stage in the pipeline are you?

Simon


lrsantiago commented 5 years ago

Hi YingYa, for me it takes only a couple of hours; for one sample, less than 5 h for sure. I have just figured out that the error I was getting has something to do with this particular sample, because I'm not getting the same error for the other samples.

Simon, I am wondering if it is possible to run gemBS call without running gemBS map. I have some BAM alignments from another tool, and I would like to use gemBS to do the variant calling. I have tried a couple of things, but I get the message "bam file is not ready to use". Do you know if this is possible and how I should proceed? Thanks

YingYa commented 5 years ago


Hi Simon, I could run the whole gemBS pipeline on the example data successfully, but I have some issues with my own data.

Here is the stage of my pipeline:

2019-08-29 18:29:50,174 DEBUG: Setting process input to parent output
2019-08-29 18:29:50,175 DEBUG: Setting process log file to gemBS/calls/bs_call_Mbias-4_chrX.err
2019-08-29 18:29:50,176 DEBUG: Setting process input to parent output
2019-08-29 18:29:50,176 DEBUG: Setting process input to parent output
2019-08-29 18:29:50,176 DEBUG: Setting process input to parent output
2019-08-29 18:29:50,176 DEBUG: Setting process input to parent output
2019-08-29 18:29:50,177 DEBUG: Starting subprocess
2019-08-29 18:29:50,179 DEBUG: Starting subprocess
2019-08-29 18:29:50,180 DEBUG: Starting subprocess
2019-08-29 18:29:50,180 DEBUG: Setting process input to parent output
2019-08-29 18:29:50,180 DEBUG: Starting subprocess
2019-08-29 18:29:50,180 DEBUG: Starting subprocess
2019-08-29 18:29:50,187 DEBUG: Starting subprocess

heathsc commented 5 years ago

Hi Leandro,

What aligner are you using? bs_call should be able to work with BAMs from novoalign, bwameth, bismark, bsmap and gem3 (with the appropriate flags for BS processing where required). After that the tricky part is making sure gemBS knows where to find the files. If the BAMs (and indexes) are where gemBS is expecting then it should be sufficient to use the db-sync sub-command to enable gemBS to use the files.

Simon
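
Before running db-sync on externally produced BAMs, it may help to confirm that each BAM and an index are actually on disk. The sketch below assumes the `<base>/<sample>/<sample>.bam` layout seen in the log earlier in this thread and the usual samtools index suffixes (`.bai` or `.csi` appended to the BAM name). Both are assumptions: the paths gemBS really expects come from its own configuration, and `missing_inputs` is a hypothetical helper, not part of gemBS.

```python
from pathlib import Path

# Index suffixes that samtools commonly appends to the BAM file name (assumed here).
INDEX_SUFFIXES = (".bai", ".csi")


def missing_inputs(base_dir, samples):
    """List problems found for each sample under base_dir/<sample>/<sample>.bam."""
    problems = []
    for sample in samples:
        bam = Path(base_dir) / sample / f"{sample}.bam"
        if not bam.is_file():
            problems.append(f"{sample}: missing BAM {bam}")
            continue
        # Accept either index flavour sitting next to the BAM.
        has_index = any(Path(str(bam) + suffix).is_file() for suffix in INDEX_SUFFIXES)
        if not has_index:
            problems.append(f"{sample}: no index next to {bam}")
    return problems
```

For example, `missing_inputs("/store/LevelOne/TAPG_WGBS/mapping", ["S2", "S17"])` returns an empty list when both BAMs and their indexes are present. This only checks the files; it does not replace the db-sync step.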


YingYa commented 5 years ago

Hi Simon,

I ran bs_call by hand, but it produced no output:

samtools view -L contigs_Mbias-4_chr5.bed -h Mbias-4.bam | bs_call -r Homo_sapiens_assembly38.fasta -n Mbias-4 --contig-bed contigs_Mbias-4_chr5.bed --report-file Mbias-4_chr5.json --right-trim 0 --left-trim 5 --conversion 0.01,0.05 --reference-bias 2 --mapq-threshold 10 --bq-threshold 13 -o Mbias-4_chr5

Thanks
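
When a manual run like the one above produces nothing, one thing that could be ruled out is a naming mismatch between the BED file and the reference, for example `chr5` in one and `5` in the other. Nothing in this thread shows that this is the cause here; the following is only a sketch of such a check, and `bed_contigs`, `fasta_contigs` and `unknown_contigs` are hypothetical helper names.

```python
def bed_contigs(bed_lines):
    """Return the distinct contig names (first column) of a BED file, in order."""
    names = []
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and the optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        name = line.split()[0]
        if name not in names:
            names.append(name)
    return names


def fasta_contigs(fasta_lines):
    """Return the set of sequence names found on the '>' lines of a FASTA file."""
    names = set()
    for line in fasta_lines:
        if line.startswith(">") and len(line.strip()) > 1:
            # The name is the first whitespace-delimited token after '>'.
            names.add(line[1:].split()[0])
    return names


def unknown_contigs(bed_lines, fasta_lines):
    """Return BED contig names that do not occur in the FASTA."""
    known = fasta_contigs(fasta_lines)
    return [name for name in bed_contigs(bed_lines) if name not in known]
```

Opening `contigs_Mbias-4_chr5.bed` and `Homo_sapiens_assembly38.fasta` as text files and passing the two file objects to `unknown_contigs` would return an empty list if every BED contig is present in the reference.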

lrsantiago commented 5 years ago

Thanks, Simon. I'm using bismark. The thing is that I forgot to use the gemBS db-sync command.



YingYa commented 5 years ago

I could run gemBS call with hg19 successfully, but not with hg38. I think this may be caused by the number of contigs in the reference: hg19 has 93, hg38 has 3366.
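
The contig counts quoted above can be reproduced by counting the header lines of each reference FASTA. This is a minimal sketch, assuming an uncompressed FASTA; `count_contigs` is a hypothetical helper name.

```python
def count_contigs(fasta_lines):
    """Count sequences in a FASTA by counting lines that start with '>'."""
    return sum(1 for line in fasta_lines if line.startswith(">"))


def count_contigs_in_file(path):
    """Open an uncompressed FASTA file and count its sequences."""
    with open(path) as handle:
        return count_contigs(handle)
```

For example, `count_contigs_in_file("Homo_sapiens_assembly38.fasta")` streams the file line by line, so it does not need to hold the whole reference in memory.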

heathsc commented 5 years ago

Is there any chance that you could provide me with the input files you used when you tried to run bs_call by hand with hg38? How much memory do you have available for the calling process?

Simon
