JaneliaSciComp / msg

Multiplexed Shotgun Genotyping
http://genomics.princeton.edu/AndolfattoLab/MSG.html

Fix Stampy Issue: stampy: Error: invalid data stream - and remove ln -s workaround #28

Open gregpinero opened 12 years ago

gregpinero commented 12 years ago

Stampy seems to map the individual FASTQ files only if their names end in .fq. Investigate and, if possible, fix what is going wrong in Stampy. Once that is fixed, remove the workaround of creating symbolic links with the .fq extension.

This isn't a priority since the workaround is working, but it might be worth doing to (a) improve Stampy and (b) make sure there's nothing wrong with our FASTQ files.
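For reference, the symlink workaround described above can be sketched roughly as follows. This is only an illustration, not the actual MSG code: the helper name `link_with_fq_extension` is hypothetical, and it assumes a POSIX filesystem where `os.symlink` is available.

```python
import os


def link_with_fq_extension(path):
    """Create a symlink named '<path>.fq' that points at path.

    Hypothetical helper for illustration only; not taken from MSG.
    Returns the name of the link.
    """
    link = path + ".fq"
    if os.path.islink(link):
        # Replace a stale link left over from an earlier run.
        os.remove(link)
    elif os.path.exists(link):
        # Never clobber a real file that happens to have the same name.
        raise OSError("refusing to overwrite existing file: %s" % link)
    os.symlink(os.path.abspath(path), link)
    return link
```

Stampy would then be pointed at the returned link name (for example `test_in_indivA12_AATAAG.fq`) instead of the original file.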

Background:

Hi,

The line that causes the error simply says "line = self.infile.readline()", so you must be pointing Stampy to an invalid file, somehow. It could be compressed perhaps; it's not likely a Stampy problem.

Renaming to .fq might be your best bet - although I don't understand what's going on.

Best wishes,
Gerton

On 16 Feb 2012, at 18:48, Pinero, Gregory wrote:

Hi There,

I'm trying to run stampy on this input file (it's inside the tar.gz file I linked to below.)

It seems to work if I rename the input file to end with .fq but not otherwise, even though I'm specifying the input format.

I'd appreciate it if you could take a look. Let me know if there's anything I can do to help.

I believe I've included all of the necessary files:

stampy_data.tar.gz

Thanks,

Greg Pinero

Here is the command I ran and the output and error I got:

[login2 - pinerog@e02u19]~/msg_work/MSG_toy>stampy.py -v3 --bwaoptions="-q10 parent1_ref.fa" -g parent1_ref.fa.stampy.msg -h parent1_ref.fa.stampy.msg --inputformat=fastq -M test_in_indivA12_AATAAG -o test_out.sam
stampy: Starting Stampy with the following options:
genome=parent1_ref.fa.stampy.msg
logfile=stderr
hash=parent1_ref.fa.stampy.msg
outputformat=sam
output=test_out.sam
inputformat=fastq
stats=mapstats.cache
qualitybase=!
recaldatasuffix=.recaldata
bwaoptions=-q10 parent1_ref.fa
bwa=bwa
verbosity=3
bits=-1
maxcount=200
maxscore=99999
minposterior=-99999
numrecords=-1
lowqthreshold=10
seed=1
insertsize=250
insertsd=60
maxfingerprintvariants=3
linearalignmentband=3
simulate-minindellen=0
simulate-maxindellen=0
tryvariants=-1
fastaqual=30
simulate-numsubstitutions=0
gapopen=40
gapextend=3
recalscoreprefix=20
svprior=55
longindelprior=40
baseentropy=5
banding=60
xa-max=0
xa-max-discordant=0
insertsize2=-2000
insertsd2=-1
padding=160
maxpairseeds=25
paircandlikethres=100
bwamaxmismatch=-1
bwabatchsize=50000
recalfraction=0.01
substitutionrate=0.001
stampy: Opening genome file parent1_ref.fa.stampy.msg.stidx
stampy: Opening hash file parent1_ref.fa.stampy.msg.sthash
stampy: Using BWAVersion: 0.5.7 (r1310) for pre-mapping
stampy: Mapping...
stampy: Traceback:
  File "/usr/local/msg/bin/stampy/stampy.py", line 701, in main()
  File "/usr/local/msg/bin/stampy/stampy.py", line 669, in main
    mapreads( settings, logger, actiondict['-M'], arguments )
  File "/usr/local/msg/bin/stampy/stampy.py", line 474, in mapreads
    for output in formatgenerator: pass
  File "/Net/fs1/home/gerton/Progs/Mapper/stampy/Stampy/formatter.py", line 115, in formatter
  File "/Net/fs1/home/gerton/Progs/Mapper/stampy/plugins/bwa.py", line 147, in generator
  File "/Net/fs1/home/gerton/Progs/Mapper/stampy/Stampy/reader.py", line 138, in generator

stampy: Error: invalid data stream
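
To follow up on the suggestion above that the input might be compressed or otherwise invalid, a quick sanity check along these lines could help rule out problems with our FASTQ files. It is a minimal sketch, not Stampy's own reader: the function name `inspect_fastq` is hypothetical, it looks only at the first record, and it assumes the common four-line FASTQ layout (no wrapped sequence lines).

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def inspect_fastq(path):
    """Return (is_gzipped, first_record_looks_like_fastq) for path.

    Hypothetical diagnostic helper; it checks only the first four lines.
    """
    with open(path, "rb") as handle:
        is_gzipped = handle.read(2) == GZIP_MAGIC

    # Read the first record, transparently decompressing if needed.
    opener = gzip.open if is_gzipped else open
    with opener(path, "rb") as handle:
        lines = [handle.readline().rstrip(b"\r\n") for _ in range(4)]

    name, seq, plus, qual = lines
    looks_ok = (
        name.startswith(b"@")        # header line
        and plus.startswith(b"+")    # separator line
        and len(seq) > 0
        and len(seq) == len(qual)    # one quality character per base
    )
    return is_gzipped, looks_ok
```

Calling `inspect_fastq("test_in_indivA12_AATAAG")` on the failing input would show whether the file is gzip-compressed and whether its first record is well formed. A result of `(False, True)` would suggest the file itself is fine and that the extension handling deserves a closer look.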