mscook closed 11 years ago
Hi Seb,
I deleted the Library.txt file from the run. I re-ran explicitly with the insert size and s.d. calculated from newbler (10 kb, 2 kb). I then hit the critical issue (just posted). I'll re-run these jobs on a single node and provide you with the required data.
Cheers
Mitch
Hi Seb,
Everything seems fine when: 1) I explicitly pass the 454 insert size metrics from newbler, and 2) multiple nodes can access the input files.
What is the content of SeedLengthDistribution.txt?
Your Library0.txt indicates short seeds.
Ping
Will close as WONTFIX if the stakeholder does not report back. We need the LibraryX.txt file to fix the unit test. Otherwise, WONTFIX.
Background
I believe this relates to: http://sourceforge.net/mailarchive/forum.php?thread_name=4F0F07C3.4070105%40ulaval.ca&forum_name=denovoassembler-users
454 PE-library information via newbler
pairDistanceRangeUsed = 5108..15324
computedPairDistanceAvg = 10216.7
computedPairDistanceDev = 2554.2
Protocol
1) Use convert-sff.sh XXXX.sff (gives XXXX.sff.OUT.fastq.Forward.fastq, XXXX.sff.OUT.fastq.Reverse.fastq and XXXX.sff.OUT.fastq.Single.fastq)
2) Feed in the Illumina PE reads (previously interleaved) and the forward/reverse/single 454 reads. Ray command:
ill=XXXX_proc.fastq
left=XXXX.sff.OUT.fastq.Forward.fastq
right=XXXX.sff.OUT.fastq.Reverse.fastq
single=XXXX.sff.OUT.fastq.Single.fastq
mpirun -np $NP Ray -p $left $right -i $ill -s $single -k 21 -o 21 -show-distance-summary
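The workaround reported earlier in this thread (passing the newbler-derived insert size explicitly instead of relying on automatic detection) could look roughly like the sketch below. It assumes that Ray's -p option accepts an optional average outer distance and standard deviation after the two file names, as its usage text describes. The helper build_ray_command is hypothetical, the 16 ranks are inferred from the Rank(0-15) output files, and 10000/2000 are the rounded newbler values (10 kb, 2 kb) mentioned in the replies.

```python
import shlex


def build_ray_command(np_ranks, left, right, interleaved, single,
                      avg_distance=None, std_dev=None, k=21, out="21"):
    """Build the mpirun/Ray argument list (hypothetical helper, not part of Ray)."""
    cmd = ["mpirun", "-np", str(np_ranks), "Ray", "-p", left, right]
    # Assumption: the optional averageOuterDistance and standardDeviation
    # follow the two file names given to -p.
    if avg_distance is not None and std_dev is not None:
        cmd += [str(avg_distance), str(std_dev)]
    cmd += ["-i", interleaved, "-s", single,
            "-k", str(k), "-o", out, "-show-distance-summary"]
    return cmd


cmd = build_ray_command(
    16,
    "XXXX.sff.OUT.fastq.Forward.fastq",
    "XXXX.sff.OUT.fastq.Reverse.fastq",
    "XXXX_proc.fastq",
    "XXXX.sff.OUT.fastq.Single.fastq",
    avg_distance=10000,  # rounded from computedPairDistanceAvg
    std_dev=2000,        # rounded from computedPairDistanceDev
)
# Print the command for inspection; nothing is executed here.
print(shlex.join(cmd))
```

Leaving avg_distance and std_dev unset reproduces the original command, where Ray estimates the distances itself.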
Output
Contigs.fasta CoverageDistributionAnalysis.txt CoverageDistribution.txt degreeDistribution.txt Library0.txt Library1.txt LibraryStatistics.txt NetworkTest.txt NumberOfSequences.txt Rank(0-15).RayContigPaths.txt RayCommand.txt RayVersion.txt SeedLengthDistribution.txt SequencePartition.txt
LibraryNumber: 0
InputFormat: TwoFiles,Paired
DetectionType: Automatic
File: XXXX.sff.OUT.fastq.Forward.fastq
NumberOfSequences: 827693
File: XXXX.sff.OUT.fastq.Reverse.fastq
NumberOfSequences: 827693
Distribution: 17/Library0.txt

LibraryNumber: 1
InputFormat: Interleaved,Paired
DetectionType: Automatic
File: XXXX_proc.fastq
NumberOfSequences: 21689626
Distribution: 17/Library1.txt
Peak 0
AverageOuterDistance: 214
StandardDeviation: 59
It appears that the auto detection of peaks has failed for the 454 data.
253 1
273 1
423 1
425 1
433 1
449 1
1131 1
1409 1
2146 1
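A quick tally suggests why automatic detection had so little to work with. The sketch below assumes the numbers above are the contents of Library0.txt, read as alternating outer-distance/count pairs; the post does not label them, so that reading is an assumption. parse_histogram is a hypothetical helper, not part of Ray.

```python
def parse_histogram(text):
    """Parse whitespace-separated 'distance count' pairs into a list of tuples."""
    values = [int(token) for token in text.split()]
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of values (distance/count pairs)")
    return list(zip(values[0::2], values[1::2]))


# Values copied from the post (assumed to be Library0.txt).
raw = "253 1 273 1 423 1 425 1 433 1 449 1 1131 1 1409 1 2146 1"
pairs = parse_histogram(raw)

total_pairs = sum(count for _, count in pairs)
max_count = max(count for _, count in pairs)

# Range reported by newbler (pairDistanceRangeUsed = 5108..15324).
low, high = 5108, 15324
in_range = sum(count for distance, count in pairs if low <= distance <= high)

print(f"observed distances: {total_pairs} of 827693 pairs")
print(f"highest count for any single distance: {max_count}")
print(f"observations inside {low}..{high}: {in_range}")
```

Under that reading, only 9 of the 827693 pairs yielded a distance, every distance was seen once, and none falls inside the range newbler used. That is consistent with the short seeds noted earlier in the thread.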
More to come when jobs complete.