hartleys / QoRTs

Quality of RNA-Seq Toolset

More informative debugging #1

Closed jdidion closed 9 years ago

jdidion commented 9 years ago

As one example, at line 562 in commonSeqUtils.scala, the error message does not give the names of the mismatched reads.

hartleys commented 9 years ago

Hmm. That particular example is a trivial addition; I can include it in the next release.

A full pass through all error handling code will of course be a larger undertaking. Many error cases already post a fair amount of information, but there are still a number of cases that don't post much at all.

Any other specific examples you have encountered?

jdidion commented 9 years ago

That's just the one that came up today. Honestly, it's the first error I've encountered, so perhaps the title of my bug report was too general.

hartleys commented 9 years ago

Do you know what caused the problem?

The obvious cause would be if you passed an unsorted or position-sorted bam file without using the "--coordSorted" option. The new error message will definitely mention this.

Or was it something else? If so I'd like to make sure I mention that other possible cause in the error message.
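To illustrate the kind of message being discussed, here is a minimal sketch in Python (not the actual QoRTs code, which is not shown in this thread). The function name `check_mate_names` and the exact wording of the message are assumptions made up for the example; the only details taken from the thread are that the message should name both reads and mention the "--coordSorted" option.

```python
def check_mate_names(read1_name: str, read2_name: str) -> None:
    """Raise an error that names both reads when a presumed pair does not match.

    Hypothetical helper for illustration; not part of QoRTs.
    """
    if read1_name != read2_name:
        raise ValueError(
            f"Paired reads do not match: read 1 is '{read1_name}', "
            f"read 2 is '{read2_name}'. The input may be unsorted or "
            "coordinate-sorted; if so, try the --coordSorted option."
        )


# A matching pair passes silently; a mismatch reports both names.
check_mate_names("readA", "readA")
```

With both names in the message, the offending record can be looked up in the BAM file directly.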

jdidion commented 9 years ago

I was using a publicly available BAM file from SRA. It had been filtered to remove unmapped reads, but the flags had not been updated so there were orphaned mates left in the BAM file that still had 0x02 set. I am running picard FixMateInformation now; hopefully that will resolve the problem. Once I modified the error message to include the specific read that was causing the problem, it was easy to debug.
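For anyone checking a file for the same problem, a rough sketch of how such orphans could be detected. It assumes the records are available as (read name, flag) tuples, for example from columns 1 and 2 of `samtools view` output; the function name `find_orphans` is hypothetical. The flag values are those defined in the SAM specification.

```python
from collections import Counter

PAIRED = 0x1          # template has multiple segments (read is paired)
PROPER_PAIR = 0x2     # each segment properly aligned according to the aligner
SECONDARY = 0x100     # secondary alignment
SUPPLEMENTARY = 0x800 # supplementary alignment


def find_orphans(records):
    """Return names of reads still flagged as properly paired whose mate is absent.

    `records` is an iterable of (read_name, flag) tuples. Secondary and
    supplementary alignments are ignored so that each read counts once.
    """
    primary = [(name, flag) for name, flag in records
               if not flag & (SECONDARY | SUPPLEMENTARY)]
    counts = Counter(name for name, flag in primary if flag & PAIRED)
    return sorted({name for name, flag in primary
                   if flag & PAIRED and flag & PROPER_PAIR and counts[name] == 1})


records = [("r1", 99), ("r1", 147), ("r2", 99)]  # r2's mate was filtered out
print(find_orphans(records))
```

This holds every read name in memory, so it is meant for spot checks rather than very large files.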

hartleys commented 9 years ago

Ah, I see.

The "--coordSorted" mode runs through the reads and keeps a running reference-map of unpaired reads. Technically speaking, it will actually run on any bam file regardless of sorting and regardless of the presence of orphaned reads (it should print a warning at the end if there are orphaned reads left over, but will otherwise ignore them). The memory usage might get unacceptable if there are too many orphaned reads or if paired reads tend to be too far apart in the file.
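The idea can be sketched roughly as follows. This is an illustration in Python under simplifying assumptions (reads are represented only by their names, and the function name `pair_reads` is made up), not the QoRTs implementation. It shows why the approach works on any ordering, why leftover orphans can simply be reported at the end, and how the peak size of the map depends on how far apart mates are in the file.

```python
def pair_reads(read_names):
    """Pair reads by name in a single pass over the file.

    Returns (pairs, orphans, peak), where pairs holds the positions of the two
    mates, orphans lists names never matched, and peak is the largest number of
    unpaired reads held in the map at any one time.
    """
    waiting = {}  # read name -> position of the first mate seen
    pairs = []
    peak = 0
    for position, name in enumerate(read_names):
        if name in waiting:
            # Second mate found: emit the pair and free the map entry.
            pairs.append((waiting.pop(name), position))
        else:
            waiting[name] = position
            peak = max(peak, len(waiting))
    orphans = sorted(waiting)  # whatever is left never found a mate
    return pairs, orphans, peak


name_sorted = ["a", "a", "b", "b", "c", "c", "d"]
coord_sorted = ["a", "b", "c", "d", "a", "b", "c"]

for label, names in (("name-sorted", name_sorted), ("coordinate-sorted", coord_sorted)):
    pairs, orphans, peak = pair_reads(names)
    print(label, "pairs:", len(pairs), "orphans:", orphans, "peak map size:", peak)
```

In this toy input both orderings yield three pairs and one orphan, but the map never holds more than one read for the name-sorted input, while it holds four for the coordinate-sorted one.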

jdidion commented 9 years ago

I think I ran out of memory. (I assume so: the process exited without any error, but SGE often kills a job that exceeds its memory limit without giving the process a chance to exit gracefully.) So I tried to name-sort the file, and then got the error that I reported.

hartleys commented 9 years ago

It's best to run java with the -Xmx option to prevent this problem. So the syntax would be something like: java -Xmx20g -jar /path/to/jarfile/QoRTs.jar QC etc etc etc

I assume you're also setting a memory usage ceiling in SGE using h_vmem? If so, make sure your java parameter is a bit smaller than that. This way SGE won't just kill the job without warning. Instead QoRTs will hit an out-of-memory exception and die (relatively) gracefully.
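As a small sketch of that rule of thumb, the helper below derives the -Xmx value from the SGE limit. The function name `qorts_command` and the 4 GB default headroom are assumptions chosen for illustration; how much headroom the JVM needs on a given system is not stated in this thread. Any QoRTs arguments after "QC" are passed through unchanged.

```python
def qorts_command(h_vmem_gb, jar_path, qorts_args, headroom_gb=4):
    """Build a java command whose heap ceiling sits below the SGE h_vmem limit.

    headroom_gb is an assumed margin for JVM overhead outside the heap;
    adjust it for your system.
    """
    heap_gb = h_vmem_gb - headroom_gb
    if heap_gb < 1:
        raise ValueError(
            f"h_vmem of {h_vmem_gb}G leaves no room for a Java heap "
            f"after {headroom_gb}G of headroom"
        )
    return ["java", f"-Xmx{heap_gb}g", "-jar", jar_path, "QC", *qorts_args]


# With h_vmem=24G and the assumed 4 GB headroom, the heap is capped at 20 GB.
print(" ".join(qorts_command(24, "/path/to/jarfile/QoRTs.jar", [])))
```

Keeping the heap ceiling below the scheduler limit means the JVM should hit its own out-of-memory error before SGE steps in.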

You can technically also run QoRTs in "--coordSorted" mode on name-sorted datasets. This should work around the issue of orphaned reads. If there are too many orphaned reads, it might still have memory issues.

jdidion commented 9 years ago

Changing the 0x02 flag to unset for orphan reads didn’t work, which in retrospect makes sense because you won’t know in advance if the unmapped reads were retained (unless you provide a command-line option).

I did finally get this to work by running in --coordSorted mode. I am curious as to why I was running out of memory when the file was coordinate-sorted but it works when name-sorted.

hartleys commented 9 years ago

If a large proportion of the paired reads are very far apart in the file, then memory use will be higher than if the file is name-sorted. When it's properly name-sorted, most reads only get loaded into the reference map for 1 line, hence the lower memory usage.

You say you fixed the 0x02 flag. But what about the "mate unmapped" flag, 0x04? According to the specification that's supposed to be the reliable way to tell if the mate should be expected, so that's what QoRTs uses.

jdidion commented 9 years ago

That makes sense. I didn't read the docs closely enough to catch that. I'll modify my script and make it available on trek for anyone else that runs into the same issue.

jdidion commented 9 years ago

Btw, I think you meant 0x08, correct? or am I misreading the spec?
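For reference, the SAM specification defines 0x4 as "segment unmapped" (the read itself) and 0x8 as "next segment in the template unmapped" (the mate). A minimal sketch of decoding the four bits discussed in this thread follows; the names `decode_flag` and `mate_expected` are hypothetical, and `mate_expected` only illustrates the rule described above (paired, and mate not flagged as unmapped) rather than reproducing QoRTs code.

```python
SAM_FLAGS = {
    0x1: "paired",         # template has multiple segments
    0x2: "proper_pair",    # each segment properly aligned according to the aligner
    0x4: "read_unmapped",  # this segment is unmapped
    0x8: "mate_unmapped",  # next segment in the template is unmapped
}


def decode_flag(flag):
    """Return the names of the bits (among the four above) set in a SAM flag."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


def mate_expected(flag):
    """True if the read is paired and its mate is not flagged as unmapped."""
    return bool(flag & 0x1) and not flag & 0x8


print(decode_flag(99), mate_expected(99))  # 99 = 0x1 + 0x2 + 0x20 + 0x40
print(decode_flag(73), mate_expected(73))  # 73 = 0x1 + 0x8 + 0x40
```

Under this rule, clearing 0x2 alone leaves `mate_expected` unchanged, which is consistent with the earlier attempt not helping.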

hartleys commented 9 years ago

Yes. That one.

Typo.

hartleys commented 9 years ago

The newest release (v0.2.17) now includes more informative debugging at the location you mentioned. In addition, the name-sorted mode is no longer the default. The default mode is now the "coordSorted" mode, which (as mentioned in this thread) actually accepts both coordinate- and name-sorted bam files. Thus, this change should not break compatibility with old scripts. The old default name-sorted mode can be activated using the "--nameSorted" option.

For backwards compatibility, the "--coordSorted" option is still accepted but has no effect.

jdidion commented 9 years ago

Great, thanks!
