10XGenomics / vartrix

Single-Cell Genotyping Tool
MIT License
199 stars 26 forks source link

crash on empty vcf input #22

Closed wheaton5 closed 5 years ago

wheaton5 commented 6 years ago

Hey Ian, Pat, and whoever else is looking at this. I'm using vartrix and haven't had a problem until I tried it on some human data and I'm splitting on chromosome regions without chunking small chromosomes (such as alt haplotype chroms) together. So I first call variants in those regions with freebayes --pooled-continuous to get candidate variant sites. So many of these chunks have no candidate variants but proper headers. I will get around this by just not calling vartrix on these, but I think for robustness it would be good to not crash on this.

one of my tomls is as follows

name = 'vartrix' operating_system = 'unix:Ubuntu' crate_version = '1.1.0' explanation = ''' Panic occurred in file 'libcore/slice/mod.rs' at line 690 ''' method = 'Panic' backtrace = ''' stack backtrace: 0: 0x2b90f78b8f5c - backtrace::backtrace::libunwind::trace::habb00f14bd19719d at /mnt/home/ian.fiddes/.cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.9/src/backtrace/libunwind.rs:53

ifiddes-10x-zz commented 6 years ago

Good catch, I will fix this ASAP.

ifiddes-10x-zz commented 6 years ago

Would it be more desirable for VarTrix to crash with a human-understandable error at this point, write this to ERROR level log but still return a 0, or write to log and still produce empty matrix files?

For your use case, where I assume that you are automatically concatenating the resulting matrices, it seems like last option might be the best?

wheaton5 commented 6 years ago

Interesting. I actually think that a human understandable error and returning 0 is preferable even though it would have made my life easier for it to have the empty matrix file. From the rust philosophy of frontloading errors it will prevent more issues downstream to outright error on "bad" input than to silently do the right thing with "bad" input. But it is debatable and you should make the call.

ifiddes-10x-zz commented 6 years ago

I agree. It is also much easier to implement the 'kill and report the error' version :)

pmarks commented 6 years ago

Hi Haynes, thanks for the report. I agree it's debatable... my initial reaction in cases like this is to try to make a null matrix. It's conceptually simpler, because it's the same 'type' of output, rather than a special case that downstream user needs to think about. Ian, does it seem like it will be hard to make an empty matrix? Does the mex format support it?

On Tue, Nov 20, 2018 at 10:59 AM ifiddes-10x notifications@github.com wrote:

I agree. It is also much easier to implement the 'kill and report the error' version :)

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/10XGenomics/vartrix/issues/22#issuecomment-440391585, or mute the thread https://github.com/notifications/unsubscribe-auth/AACzA_paPVFd2Wa31ot_HfVFBueIlh1Wks5uxFEZgaJpZM4Yqwru .

ifiddes-10x-zz commented 6 years ago

It looks like the format supports it. It won't be too hard, just need to re-do some logic.

wheaton5 commented 6 years ago

Hey guys. getting a few more errors. Figured I would keep it on this issue because it says its the same line number. But the vcf isn't empty this time. So far I just have 2 chunks failing with 1-2 variants per file. Haven't looked at the rust myself maybe it doesn't find cells supporting according to its criteria and it doesnt like outputting empty matrices?

toml here name = 'vartrix' operating_system = 'unix:Ubuntu' crate_version = '1.1.0' explanation = ''' Panic occurred in file 'libcore/slice/mod.rs' at line 690 ''' method = 'Panic' backtrace = ''' stack backtrace: 0: 0x2b5817e80f5c - backtrace::backtrace::libunwind::trace::habb00f14bd19719d at /mnt/home/ian.fiddes/.cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.9/src/backtrace/libunwind.rs:53

ifiddes-10x-zz commented 6 years ago

Ah, it seems like it is the chunks method of Vec that is causing the problems. I need to make sure that the number of chunks is always non-zero. It can be zero when the number of threads exceeds the number of variants, or when the number of variants is 0. I just pushed a fix to the empty_vcf branch. If you are compiling it yourself, you can try that one out. Or wait for Travis to run then I can upload a new binary.

ifiddes-10x-zz commented 6 years ago

I have put up a pre-release binary for 1.1.1 that fixes this issue.