NKI-GCF / Metrix

A Server / Client parser for Illumina InterOp sequencer run metrics.

Extraction metrics fail to parse #3

Open froggleston opened 9 years ago

froggleston commented 9 years ago

Hi Bernd,

I'm trying the new Metrix maven-dev code branch, and I'm hitting issues with processing InterOps that were previously fine:

[SEVERE] [Wed 07-01-2015 16:15:41] [nki.parsers.illumina.ExtractionMetrics] : Error in parsing version number and recordLength: java.io.EOFException

Does this have something to do with the updates for the HiSeq2500 that are not backwards compatible with previous InterOp formats?

Cheers

Rob
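
For context on the log line above: Illumina InterOp files such as ExtractionMetricsOut.bin are generally documented as starting with a one-byte version number followed by a one-byte record length. An EOFException at that stage therefore suggests the parser hit the end of the stream before those two bytes were available, for example because the file was empty, truncated, or not the file it expected. The following is a minimal sketch of that header read, not Metrix's actual code; the class `InterOpHeader` and its `read` methods are hypothetical names used for illustration only.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (not part of Metrix): holds the two-byte InterOp header. */
final class InterOpHeader {
    final int version;
    final int recordLength;

    InterOpHeader(int version, int recordLength) {
        this.version = version;
        this.recordLength = recordLength;
    }

    /**
     * Reads the version (byte 0) and record length (byte 1) as unsigned bytes.
     * DataInputStream.readUnsignedByte() throws java.io.EOFException when the
     * stream ends early, which is the failure mode shown in the log message.
     */
    static InterOpHeader read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int version = data.readUnsignedByte();
        int recordLength = data.readUnsignedByte();
        return new InterOpHeader(version, recordLength);
    }

    /** Convenience overload that opens the file and closes it after the header is read. */
    static InterOpHeader read(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return read(in);
        }
    }
}
```

Running a check like this against the ExtractionMetricsOut.bin of an affected run would show whether the file itself is short, or whether the parser is being pointed at the wrong stream.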

Rhizosis commented 9 years ago

Hi Rob,

That sounds strange. The only changes were made to the QualityMetrics parser, because of the QScore binning feature.

I will look into it on Friday when I get back. I haven't run into any problems with the extraction metrics.

If possible, would you be so kind as to zip the run folder with the following:

- RunInfo.xml
- InterOp/

You can of course leave out the indexMetrics. The run folder name and any private contents in RunInfo.xml can be anonymized.

I'll run it on my end and get back to you.

Cheers, Bernd

djcooke commented 9 years ago

Hi,

I have been working on a fork of Rob's work that uses Metrix. I had noticed this same error, but my solution for another problem resulted in working around this. Our problem was that our RunInfo.xml files are often gzipped, and I needed to be able to handle that.

Originally, the MetrixContainer(String runDir) constructor was used, and this looks for the usual RunInfo.xml. Instead, I created a Summary using the (possibly gzipped) RunInfo, and used the MetrixContainer(Summary summary, boolean remote) constructor. Using this constructor, extractionMetrics is properly parsed.
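
Opening a RunInfo.xml that may or may not be gzipped can be sketched roughly as below. This is an illustrative assumption about one way to do it, not the code from the fork: the class `MaybeGzip` and its `open` methods are hypothetical, and the sketch deliberately stops at producing an InputStream rather than guessing at how Summary or MetrixContainer consume it. It peeks at the first two bytes and wraps the stream in a GZIPInputStream only when they match the gzip magic number (0x1f 0x8b).

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;

/** Hypothetical helper (not part of Metrix): transparently opens plain or gzipped input. */
final class MaybeGzip {
    private MaybeGzip() {}

    /** Returns a stream of decompressed bytes if the input is gzipped, else the input unchanged. */
    static InputStream open(InputStream raw) throws IOException {
        BufferedInputStream in = new BufferedInputStream(raw);
        in.mark(2);              // remember the start so the peeked bytes can be replayed
        int b0 = in.read();
        int b1 = in.read();
        in.reset();              // rewind to the marked position
        boolean gzipped = b0 == 0x1f && b1 == 0x8b;
        return gzipped ? new GZIPInputStream(in) : in;
    }

    /** Convenience overload for a file path, e.g. RunInfo.xml or RunInfo.xml.gz. */
    static InputStream open(Path file) throws IOException {
        return open(Files.newInputStream(file));
    }
}
```

Detecting the format from the content rather than from the file extension means the same call works whether or not the file name ends in .gz.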

There were side-effects, however: The MetrixContainer(Summary, boolean) constructor does not set the Summary's currentCycle when parsing ExtractionMetricsOut.bin. This also results in errorMetrics not being parsed. I worked around this by setting the currentCycle before constructing the MetrixContainer.

One other difference I noticed before and after these changes was in the raw combined read quality scores taken from QMetricsOut.bin. The values I'm getting now are equal to the sums of lane quality scores. Originally, the values were exactly double what I am getting now, which I believe was incorrect.

I hope this information is useful!

Regards, Dillan

froggleston commented 9 years ago

Hi Dillan,

Could you raise a pull request for your changes? They sound really useful!

Cheers

Rob

djcooke commented 9 years ago

Sure, of course Rob! Moving conversation to TGAC's JIRA