heathsc / gemBS

gemBS is a bioinformatics pipeline designed for high throughput analysis of DNA methylation from Whole Genome Bisulfite Sequencing data (WGBS).
GNU General Public License v3.0

Error while executing the Bisulfite bisulphite-mapping #46

Closed IsmailM closed 5 years ago

IsmailM commented 5 years ago

I'm analysing single-end reads and I get the following error:

Level 30:
Level 30: Command map started at 2019-03-14 14:21:38.988822
Level 30:
Level 30: ------------ Mapping Parameters ------------
Level 30: Sample barcode   : PCa_45_01b_S19
Level 30: Data set         : PCa_45_01b_S19
Level 30: No. threads      : 70
Level 30: Index            : /home/ucbtmog/d/nugen/t/ref_indexes/hg37.BS.gem
Level 30: Paired           : False
Level 30: Read non stranded: False
Level 30: Type             : SINGLE
Level 30: Input Files      : PCa_45_01b_S19_R1.fastq.gz
Level 30: Output dir       : /home/ucbtmog/d/nugen/t/mapping/PCa_45_01b_S19
Level 30:
Level 30: Bisulfite Mapping...
2019-03-14 14:21:39,017 ERROR: Process '/home/ucbtmog/.pyenv/versions/3.5.3/lib/python3.5/site-packages/gemBS/gemBSbinaries/gem-mapper' finished with 1
Traceback (most recent call last):
  File "/home/ucbtmog/.pyenv/versions/3.5.3/bin/gemBS", line 13, in <module>
    load_entry_point('gemBS==3.2.6', 'console_scripts', 'gemBS')()
  File "/home/ucbtmog/.pyenv/versions/3.5.3/lib/python3.5/site-packages/gemBS/commands.py", line 157, in gemBS_main
    instances[args.command].run(args)
  File "/home/ucbtmog/.pyenv/versions/3.5.3/lib/python3.5/site-packages/gemBS/production.py", line 367, in run
    self.do_mapping(fl)
  File "/home/ucbtmog/.pyenv/versions/3.5.3/lib/python3.5/site-packages/gemBS/production.py", line 559, in do_mapping
    under_conversion=self.underconversion_sequence,over_conversion=self.overconversion_sequence)
  File "/home/ucbtmog/.pyenv/versions/3.5.3/lib/python3.5/site-packages/gemBS/__init__.py", line 739, in mapping
    raise ValueError("Error while executing the Bisulfite bisulphite-mapping")
ValueError: Error while executing the Bisulfite bisulphite-mapping

The error logs for the mapping are empty.

IsmailM commented 5 years ago

Okay, found a fix:

After adding a log-line to see the mapping process, I see that it attempts to run:

/home/ucbtmog/.pyenv/versions/3.5.3/lib/python3.5/site-packages/gemBS/gemBSbinaries/gem-mapper -I /home/ucbtmog/d/nugen/t/ref_indexes/hg37.BS.gem -i PCa_45_01b_S19_R1.fastq.gz -t 70 --report-file /home/ucbtmog/d/nugen/t/mapping/PCa_45_01b_S19/PCa_45_01b_S19.json -r @RG\\tID:PCa_45_01b_S19\\tSM:PCa_45_01b_S19\\tBC:PCa_45_01b_S19\\tPU:PCa_45_01b_S19

This gives me the following error message:

GEM::FatalError (fm.c:193,fm_open_file)
 Could not open file 'PCa_45_01b_S19_R1.fastq.gz'

It would be nice if this were reported in the error logs or on STDOUT.
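To illustrate the kind of reporting I mean, here is a minimal sketch of a wrapper that captures the child process's stderr and attaches it to the raised error. The function name `run_and_report` is hypothetical and this is not how gemBS currently launches gem-mapper; it only assumes the standard `subprocess` module.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd (a list of arguments); return its stdout.

    Hypothetical helper, not part of gemBS. On a non-zero exit status the
    child's stderr is included in the exception message, so a message such
    as "Could not open file ..." is not lost.
    """
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text (works on Python 3.5)
    )
    if proc.returncode != 0:
        raise ValueError(
            "Process {!r} finished with {}:\n{}".format(
                cmd[0], proc.returncode, proc.stderr.strip()
            )
        )
    return proc.stdout


if __name__ == "__main__":
    # Stand-in for a failing mapper call: writes to stderr and exits with 1.
    failing = [sys.executable, "-c",
               "import sys; sys.stderr.write('Could not open file'); sys.exit(1)"]
    try:
        run_and_report(failing)
    except ValueError as err:
        print(err)
```

With something along these lines, the traceback above would have carried the GEM::FatalError text instead of only the exit status.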

So the issue is that the full path to the input file is not passed to the tool.

Here is a fix:

https://github.com/heathsc/gemBS/blob/fca06d59931a6639fee89839009847d84a08824a/gemBS/production.py#L480-L503

Specifically, line 490 should be changed to:

if not skip: inputFiles.append(os.path.join(input_dir,file))

This is similar to what is done for paired-end reads; see lines 499 & 451 above.
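As a standalone illustration of the effect of that change (a sketch only: `resolve_inputs` and its arguments are hypothetical names, not the actual code in production.py), joining the input directory onto each file name gives the mapper a path that does not depend on the current working directory:

```python
import os


def resolve_inputs(input_dir, file_names):
    """Return each file name joined onto input_dir.

    Hypothetical helper for illustration. os.path.join leaves a name that is
    already absolute unchanged, so absolute inputs are not altered.
    """
    return [os.path.join(input_dir, name) for name in file_names]


if __name__ == "__main__":
    # The directory below is a made-up example, not a path from this issue.
    for path in resolve_inputs("/data/fastq", ["PCa_45_01b_S19_R1.fastq.gz"]):
        print(path)
```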

heathsc commented 5 years ago

Thanks for the feedback. I will make the suggested changes.

Simon

heathsc commented 5 years ago

Fix applied.