JoseBlanca / franklin

franklin library for NGS sequencing analysis.
http://bioinf.comav.upv.es/franklin/
GNU Affero General Public License v3.0

Samtools fails if reference file has empty lines #54

Open necrolyte2 opened 11 years ago

necrolyte2 commented 11 years ago

Should be a simple fix to ensure that the reference fasta file does not contain empty lines. I'll likely fork the project soon anyways and will be happy to push the fix.

Trying to get a feel for the project right now. So far very awesome!

Traceback (most recent call last):
  File "/home/EIDRUdata/Tyghe/Dev/ngs_backbone/src/franklin/scripts/backbone/backbone_analysis.py", line 81, in main
    do_analysis(project_settings=settings_fpath, kind=action)
  File "/home/EIDRUdata/Tyghe/Dev/ngs_backbone/src/franklin/franklin/backbone/backbone_runner.py", line 124, in do_analysis
    analyzer.run()
  File "/home/EIDRUdata/Tyghe/Dev/ngs_backbone/src/franklin/franklin/backbone/mapping.py", line 286, in run
    tmp_dir=tmp_dir)
  File "/home/EIDRUdata/Tyghe/Dev/ngs_backbone/src/franklin/franklin/sam.py", line 375, in realign_bam
    create_sam_reference_index(reference_fpath)
  File "/home/EIDRUdata/Tyghe/Dev/ngs_backbone/src/franklin/franklin/sam.py", line 368, in create_sam_reference_index
    call(cmd, raise_on_error=True)
  File "/home/EIDRUdata/Tyghe/Dev/ngs_backbone/src/franklin/franklin/utils/cmd_utils.py", line 552, in call
    raise RuntimeError(msg)
RuntimeError: Error running command: /home/EIDRUdata/Tyghe/Dev/ngs_backbone/src/franklin/ext/bin/linux/64bit/samtools faidx /home/EIDRUdata/Tyghe/sequence_data/tutorial_replicate/mapping/reference/reference.fasta
stderr: [fai_build_core] inlined empty line is not allowed in sequence 'CY074919_NS_Managua09'.

JoseBlanca commented 11 years ago

Hi:

I'd rather fix the input file with grep than add this extra functionality, but that's just my two cents. By the way, it is great to have somebody interested in this code. We are not developing franklin much these days; we're more active in seq_crumbs. seq_crumbs is a rewrite of the read cleaning code that is faster and better structured. If you're interested in read cleaning, you should give it a look.
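
For anyone hitting the same error, here is a minimal sketch of cleaning the reference before it reaches samtools faidx. It is not part of franklin: the function name strip_blank_lines is hypothetical, and treating whitespace-only lines as empty is an assumption on my part. On the command line, something like grep -v '^$' reference.fasta > cleaned.fasta should do roughly the same job for strictly empty lines.

```python
import io


def strip_blank_lines(in_fhand, out_fhand):
    """Copy FASTA lines from in_fhand to out_fhand, skipping blank lines.

    Hypothetical helper, not a franklin function. Lines that contain only
    whitespace are also treated as blank (an assumption).
    Returns the number of lines that were dropped.
    """
    dropped = 0
    for line in in_fhand:
        if line.strip():
            out_fhand.write(line)
        else:
            dropped += 1
    return dropped


if __name__ == "__main__":
    # Small in-memory example; a real run would open the reference file
    # for reading and a new file for writing instead.
    fasta = ">seq1\nACGT\n\nTTGA\n>seq2\nGGCC\n"
    cleaned = io.StringIO()
    n_dropped = strip_blank_lines(io.StringIO(fasta), cleaned)
    print("blank lines dropped:", n_dropped)
    print(cleaned.getvalue(), end="")
```

Writing to a new file rather than editing in place keeps the original reference untouched, so the cleaned copy can be compared against it before re-running the mapping step.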