YosefLab / Cassiopeia

A Package for Cas9-Enabled Single Cell Lineage Tracing Tree Reconstruction
https://cassiopeia-lineage.readthedocs.io/en/latest/
MIT License

Segmentation Fault in error_correct_cellbcs_to_whitelist #198

Closed by syntonym 1 year ago

syntonym commented 1 year ago

I have a dataset with scarring arrays and am trying to follow the preprocessing user guide. When correcting the cell barcodes to the whitelist, the program crashes very early with a segmentation fault during the error_correct_cellbcs_to_whitelist call.

Code:

 import cassiopeia as cas

 bam_fp = cas.pp.error_correct_cellbcs_to_whitelist(
     bam_fp,
     whitelist='data/3M-february-2018.txt',
     output_directory=output_dir,
     n_threads=n_threads,
 )

stderr:

[3/4] Finding mismatches:   1%|1         | 2776/252224 [01:52<2:48:23, 24.69it/s]
/software/sge-2011.11/default/spool/fermat/job_scripts/9617950: line 14: 11249 Segmentation fault      python3 main.py

When I rerun the same code, the program crashes reproducibly at the same index (±2).

Segmentation faults in pure Python code are rare, so I suspect compiled code is the origin of the segfault. I traced the logging message back to ngs-tools, which uses numba to speed up some private functions. Contrary to my expectation, disabling numba compilation with NUMBA_DISABLE_JIT=1 did not prevent the segfault, so it seems that function is not the origin.

I'd be glad for pointers on how to make Cassiopeia work on my dataset. I looked into the BAM file at the corresponding index, but the read looks like any other read. I'm currently running the pipeline on another sample to see if that runs through. I thought about tracing with gdb to see where the segfault comes from, or using a separate package to do the cell barcode correction, but I hoped that you might have some insight into how this problem could be fixed.
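Before reaching for gdb, a lighter first step is Python's standard-library faulthandler module, which prints the Python-level traceback when the interpreter receives a fatal signal such as SIGSEGV. The sketch below is only an illustration: the helper name enable_crash_tracebacks is hypothetical, and the traceback shows which Python call was active at the crash, not the faulting C frame.

```python
import faulthandler
import sys


def enable_crash_tracebacks(stream=sys.stderr):
    """Dump Python tracebacks for all threads on fatal signals (e.g. SIGSEGV).

    `stream` must be a real file object with a file descriptor; it has to
    stay open for as long as the handler is enabled.
    """
    faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()


# Usage (hypothetical placement): call enable_crash_tracebacks() at the top of
# main.py, before the step that crashes.
```

The same effect is available without code changes by running `python3 -X faulthandler main.py` or by setting the environment variable PYTHONFAULTHANDLER=1 in the job script.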

mattjones315 commented 1 year ago

Hi @syntonym, thanks for reaching out and for using Cassiopeia. I'm happy to help here.

This certainly sounds like an issue with our compiled code in the cellBC error correction step. I'm thinking this could be a versioning or system issue, since our current build is passing the tests related to this module. Can you provide me with a bit more information so I can help track this down?

Thanks again and hope to fix this problem for you soon.

Best, Matt

syntonym commented 1 year ago

Hey Matt,

thanks for your quick answer. I ran the preprocessing test cases and got a segmentation fault after pytest finished (some test cases also fail):

FAILED test/preprocess_tests/call_lineage_groups_test.py::TestCallLineageGroup::test_basic_grouping - AttributeError: 'DataFrame' object has no attribute 'append'
FAILED test/preprocess_tests/call_lineage_groups_test.py::TestCallLineageGroup::test_doublet - AttributeError: 'DataFrame' object has no attribute 'append'
FAILED test/preprocess_tests/call_lineage_groups_test.py::TestCallLineageGroup::test_filter_reassign - AttributeError: 'DataFrame' object has no attribute 'append'
FAILED test/preprocess_tests/call_lineage_groups_test.py::TestCallLineageGroup::test_format - AttributeError: 'DataFrame' object has no attribute 'append'
FAILED test/preprocess_tests/call_lineage_groups_test.py::TestCallLineageGroup::test_reassign - AttributeError: 'DataFrame' object has no attribute 'append'
==================== 5 failed, 71 passed, 2 warnings in 25.79s =====================
Segmentation fault

I also installed Cassiopeia on a separate computer, and running the preprocessing tests there does not lead to a segmentation fault:

================= short test summary info =================
FAILED test/preprocess_tests/call_lineage_groups_test.py::TestCallLineageGroup::test_basic_grouping - AttributeError: 'DataFrame' object has no attribute 'a...
FAILED test/preprocess_tests/call_lineage_groups_test.py::TestCallLineageGroup::test_doublet - AttributeError: 'DataFrame' object has no attribute 'a...
FAILED test/preprocess_tests/call_lineage_groups_test.py::TestCallLineageGroup::test_filter_reassign - AttributeError: 'DataFrame' object has no attribute 'a...
FAILED test/preprocess_tests/call_lineage_groups_test.py::TestCallLineageGroup::test_format - AttributeError: 'DataFrame' object has no attribute 'a...
FAILED test/preprocess_tests/call_lineage_groups_test.py::TestCallLineageGroup::test_reassign - AttributeError: 'DataFrame' object has no attribute 'a...
FAILED test/preprocess_tests/resolve_umi_sequence_test.py::TestResolveUMISequence::test_filter_by_reads - ValueError: setting an array element with a sequence. ...
======== 6 failed, 70 passed, 2 warnings in 23.26s ========

So it seems there is indeed a problem with some dependencies. I did deviate from the installation instructions: I built Cassiopeia via poetry build and installed the resulting wheel into a virtual environment. I will double-check my dependencies and try again with a fresh virtual environment.

I'm running on:

mattjones315 commented 1 year ago

Hi @syntonym,

Great, thanks so much. For what it's worth, you are getting those errors in the second build because pandas v2.0.1 breaks some of the code in the preprocessing module. I noticed it last week and made some changes in a PR that is yet to be merged (it includes other features). I'll move those pandas 2.0.1 changes over and merge them into our main branch.

As for the segmentation fault, we haven't built with poetry before, so I wonder if it is installing different versions of dependencies or not compiling the code correctly. The only package differences that I have here are pysam 0.20.0 and numba 0.56.4. Does downgrading to either of these versions fix the segmentation fault? What is different about the second computer, which does not produce a segmentation fault?

Thanks, Matt
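One way to answer the version question is to record the installed versions on each machine and compare them. The following is a minimal sketch using only the standard library; the helper names installed_versions and diff_versions are hypothetical, and the package list is simply the set of packages mentioned in this thread.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def diff_versions(a, b):
    """Return {name: (version_in_a, version_in_b)} for entries that differ."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# Packages named in this thread; run this on both machines and compare.
PACKAGES = ["pysam", "numba", "pandas", "ngs-tools"]
print(installed_versions(PACKAGES))
```

Identical versions on both machines would point away from the Python packages themselves and toward how the compiled extensions were built.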

mattjones315 commented 1 year ago

Just a quick update: I pushed code that fixes the breaking changes in pandas>2.0.0 in #199.
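For context on the AttributeError in the test output above: DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, and the usual replacement is pd.concat. The sketch below is a generic illustration of that migration, not the actual change made in #199; the helper name append_row and the column names are hypothetical.

```python
import pandas as pd


def append_row(df, row):
    """Return a new DataFrame with `row` (a dict) added as the last row.

    pd.concat works on both pandas 1.x and 2.x, unlike the removed
    DataFrame.append.
    """
    return pd.concat([df, pd.DataFrame([row])], ignore_index=True)


# Hypothetical columns, chosen only for illustration.
table = pd.DataFrame({"cellBC": ["A", "B"], "UMI": [3, 5]})
table = append_row(table, {"cellBC": "C", "UMI": 7})
```

When many rows are added in a loop, collecting them in a list and calling pd.concat once at the end avoids repeatedly copying the frame.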

syntonym commented 1 year ago

I recompiled all dependencies, but the problem remained on the first computer. As I cannot reproduce the problem on the second computer with the same package versions, I suspect that something goes wrong when I compile either pysam or ngs-tools. As this is a problem with my compilation environment, there is nothing Cassiopeia can really do about it, so I'll close the issue. I will try to get access to a different cluster.

Thanks for your quick and insightful replies!