immcantation / presto

pRESTO is part of the Immcantation analysis framework for Adaptive Immune Receptor Repertoire sequencing (AIRR-seq). pRESTO is a bioinformatics toolkit for processing high-throughput lymphocyte receptor sequencing data.
https://presto.readthedocs.io
GNU Affero General Public License v3.0

Removing pandas from countMismatches in EstimateError.py #64

Closed by ssnn-airr 6 years ago

ssnn-airr commented 6 years ago

Original report by Roy Jiang (Bitbucket: ruoyijiangyale).


cProfile suggests that indexing and slicing operations on pandas DataFrames are slowing down countMismatches.
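
For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of profiling a single call with cProfile and sorting by cumulative time. The function `count_mismatches_standin` and the helper `profile_call` are hypothetical stand-ins for illustration only; they are not the actual countMismatches code in EstimateError.py.

```python
import cProfile
import io
import pstats


def count_mismatches_standin(seqs, consensus):
    """Hypothetical stand-in for the function under investigation."""
    total = 0
    for seq in seqs:
        # Count positions where the read differs from the consensus
        total += sum(1 for a, b in zip(seq, consensus) if a != b)
    return total


def profile_call(func, *args, top=10):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so expensive call chains appear first
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


seqs = ["ACGTACGT", "ACGAACGT", "TCGTACGA"]
result, report = profile_call(count_mismatches_standin, seqs, "ACGTACGT")
print(report)
```

When the profiled function uses pandas, DataFrame indexing and slicing typically appear in such a report as calls into pandas internals, which is the kind of evidence described above.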

Proposal: remove pandas from EstimateError.py, replacing it with either the csv module or NumPy.
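
As a rough illustration of what a NumPy-based replacement could look like, the sketch below counts per-position mismatches against a consensus using plain arrays instead of a DataFrame. This is an assumption about the general shape of the computation, not the logic in EstimateError.py: the function name `count_mismatches`, the `ignore` parameter, and the requirement that all sequences share one length are all hypothetical.

```python
import numpy as np


def count_mismatches(seqs, consensus, ignore="N-."):
    """Count mismatches per position for equal-length sequences.

    Hypothetical sketch: positions where either the read or the
    consensus holds a character from `ignore` are not compared.
    Returns (mismatch counts, number of compared reads) per position.
    """
    # 2D array of single characters, one row per sequence
    arr = np.array([list(s) for s in seqs])
    # 1D array of consensus characters, broadcast across rows below
    ref = np.array(list(consensus))
    skip = list(ignore)
    # True where both the read and the consensus are comparable
    mask = ~np.isin(arr, skip) & ~np.isin(ref, skip)
    mismatches = (arr != ref) & mask
    return mismatches.sum(axis=0), mask.sum(axis=0)


seqs = ["ACGT", "ACGA", "NCGT"]
mismatches, compared = count_mismatches(seqs, "ACGT")
print(mismatches, compared)
```

The whole comparison runs as a few vectorized array operations, which avoids the per-row indexing overhead that DataFrame access incurs.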

ssnn-airr commented 6 years ago

Original comment by Roy Jiang (Bitbucket: ruoyijiangyale).


EstimateError.py still needs unit tests overall. The new branch EstimateErrorNoPandas should show a noticeable performance difference.

ssnn-airr commented 6 years ago

Original comment by Jason Vander Heiden (Bitbucket: javh, GitHub: javh).


Also see Sequence.calculateSetError.