JaneliaSciComp / msg

Multiplexed Shotgun Genotyping
http://genomics.princeton.edu/AndolfattoLab/MSG.html

/ Character in FASTA sequence identifiers causes MSG to think references are directories #36

Open gregpinero opened 11 years ago

gregpinero commented 11 years ago

The problem occurs in extract-ref-alleles.py (and possibly in other places too).

It is in this line:

            alleles_outfile = open(os.path.join(outdir, ref + '-ref.alleles'), 'a')

and this line:

            orths_outfile = open(os.path.join(outdir, ref + '-orths.alleles'), 'a')

in the function "store_and_remove_alleles_orths".

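Example: the file dsim-all-chromosome-r1.3.reassembly1.updated_wsu1.fasta contained the header ">D.sim;ChrU;removedfrom2RbyDLSon1/4/09". Because the reference name contains "/" characters (in "1/4/09"), os.path.join treats part of the name as directories.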
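To make the failure mode concrete, here is a minimal reproduction sketch. It only mimics the `open(os.path.join(...))` call quoted above. The helper `try_open_alleles` and the temporary output directory are hypothetical and are not part of MSG.

```python
import os
import tempfile

# Header text from the example above, without the leading ">".
ref = "D.sim;ChrU;removedfrom2RbyDLSon1/4/09"


def try_open_alleles(outdir, ref):
    """Hypothetical helper: mimic the quoted open() call and report the outcome."""
    path = os.path.join(outdir, ref + '-ref.alleles')
    try:
        with open(path, 'a'):
            return path, None
    except OSError as exc:
        # The intermediate "directories" implied by the slashes do not exist.
        return path, exc


with tempfile.TemporaryDirectory() as outdir:
    path, error = try_open_alleles(outdir, ref)
    # Path relative to outdir, to show how the name was split into components.
    rel = os.path.relpath(path, outdir)
    print(rel.split(os.sep))
    print(type(error).__name__)
```

The relative path splits into three components instead of one, so the open call fails unless directories with those names happen to exist.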

dstern commented 11 years ago

Is this causing a crash? Perhaps the easiest fix is to replace all forward slashes and all blank spaces in FASTA headers with an underscore (_).
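A rough sketch of that replacement, assuming the headers are rewritten before the FASTA file is given to MSG. The functions `sanitize_header` and `sanitize_fasta` are hypothetical names used for illustration and do not exist in MSG.

```python
import re


def sanitize_header(line):
    """Replace '/' and whitespace in a FASTA header line with '_'.

    Hypothetical helper, not part of MSG. Lines that do not start
    with '>' (sequence lines) are returned unchanged.
    """
    if not line.startswith('>'):
        return line
    stripped = line.rstrip('\r\n')
    newline = line[len(stripped):]  # preserve the original line ending
    return '>' + re.sub(r'[/\s]', '_', stripped[1:]) + newline


def sanitize_fasta(lines):
    """Yield the lines of a FASTA file with sanitized headers."""
    for line in lines:
        yield sanitize_header(line)


example = [">D.sim;ChrU;removedfrom2RbyDLSon1/4/09\n", "ACGT\n"]
cleaned = list(sanitize_fasta(example))
print(cleaned[0], end="")
```

The same substitution could instead be applied to `ref` just before the `os.path.join` calls. Either way, the sanitized names would need to match whatever names the rest of the pipeline uses for those references.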

gregpinero commented 11 years ago

Indeed that's actually what I did.

I'll have to fix it in the code when I get a chance though.

Greg

On Oct 4, 2012, at 4:31 PM, dstern notifications@github.com wrote:

Is this causing a crash? Perhaps the easiest fix is to replace all forward slashes and all blank spaces in FASTA headers with an underscore (_).