yatisht / usher

Ultrafast Sample Placement on Existing Trees
MIT License

matutils extract segfaults with -s, but not -K, if no samples in text file are on tree #365

Open aofarrel opened 4 months ago

aofarrel commented 4 months ago

matUtils extract can select samples listed in a text file with either -s or -K, and the text file format is the same in both cases. However, the behavior is inconsistent when the text file lists only samples that are not on the tree: with -s the program segfaults, whereas with -K it handles the error gracefully and outputs a "subtree" that isn't actually a subtree.

example

Zip file with this example: example_pb_and_txt.zip

The samples within 2_samples_invalid.txt are not on the tree.

SAMEA5626318_foo
SAMEA111556040_bar

If I run matUtils extract -i 10_sra_samples.pb -K 2_samples_invalid.txt:1 -o some.subtree.pb, the program points out that these samples are not on the tree and then outputs a "subtree" that isn't actually a subtree.

If I run matUtils extract -i 10_sra_samples.pb -s 2_samples_invalid.txt -o some.subtree.pb, the program will segfault.
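Until this is fixed, a pre-check on the sample list could avoid reaching the crash. The following is only a minimal sketch of that idea and is not part of matUtils: the helper names `read_sample_list` and `split_requested_samples` are hypothetical, and it assumes the set of sample names on the tree has already been obtained by some other means.

```python
from pathlib import Path


def read_sample_list(path):
    """Read one sample name per line, skipping blank lines."""
    text = Path(path).read_text()
    return [line.strip() for line in text.splitlines() if line.strip()]


def split_requested_samples(requested, on_tree):
    """Partition requested names into (found, missing), preserving input order."""
    on_tree = set(on_tree)
    found, missing = [], []
    for name in requested:
        (found if name in on_tree else missing).append(name)
    return found, missing


# Placeholder tree contents (assumption): real names would come from the tree itself.
tree_samples = {"sample_on_tree_1", "sample_on_tree_2"}

# The two names from 2_samples_invalid.txt in this report.
requested = ["SAMEA5626318_foo", "SAMEA111556040_bar"]

found, missing = split_requested_samples(requested, tree_samples)
if not found:
    # Nothing requested is on the tree, so skip calling extract with -s entirely.
    print("No requested samples are on the tree:", ", ".join(missing))
```

If `found` comes back empty, the caller can skip the -s invocation and report the missing names instead.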

expected behavior

One of the following:

yatisht commented 4 months ago

Thanks for reporting, @aofarrel.

@jmcbroome are you available to look into this?