scverse / scirpy

A scanpy extension to analyse single-cell TCR and BCR data.
https://scirpy.scverse.org/en/latest/
BSD 3-Clause "New" or "Revised" License

Integrate TCRdist3 #461

Closed by grst 7 months ago

grst commented 1 year ago

Another option towards #304.

TCRdist3, or rather its dependency pwseqdist (MIT licensed), comes with a numba implementation of the TCRdist3 distance metrics: https://github.com/agartland/pwseqdist/blob/master/pwseqdist/nb_metrics.py

It would be easy to convert this into a DistanceCalculator. Allegedly this is faster than using parasail, and it seems less complex than a full sequence alignment: for instance, there are no affine gap penalties.
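To illustrate why this kind of metric is simpler than a full alignment, here is a minimal pure-Python sketch of a TCRdist-style distance. It is not the pwseqdist code: the function name `gap_simplified_distance`, the 0/1 substitution cost, and the default gap penalty are assumptions made for illustration only. The idea is that sequences of equal length are compared position by position, and sequences of different length get a single contiguous gap with a flat per-residue penalty, so no dynamic programming and no affine gap model are needed.

```python
def gap_simplified_distance(s1, s2, gap_penalty=4, mismatch=1):
    """Toy TCRdist-style distance (illustrative, not the pwseqdist implementation).

    Assumptions: substitutions cost `mismatch` (a real metric would use a
    substitution matrix), and a length difference is handled by one
    contiguous gap that costs `gap_penalty` per missing residue.
    """
    # Make s1 the shorter sequence so the gap is always placed in s1.
    if len(s1) > len(s2):
        s1, s2 = s2, s1
    len_diff = len(s2) - len(s1)

    def sub_cost(a, b):
        return 0 if a == b else mismatch

    if len_diff == 0:
        # Equal length: plain position-wise comparison, no gap needed.
        return sum(sub_cost(a, b) for a, b in zip(s1, s2))

    # Unequal length: try every position for the single gap block and
    # keep the placement with the lowest substitution cost.
    best = None
    for gap_pos in range(len(s1) + 1):
        left = sum(sub_cost(a, b) for a, b in zip(s1[:gap_pos], s2[:gap_pos]))
        right = sum(
            sub_cost(a, b) for a, b in zip(s1[gap_pos:], s2[gap_pos + len_diff:])
        )
        if best is None or left + right < best:
            best = left + right
    return best + len_diff * gap_penalty


# Hypothetical CDR3-like strings, used only to exercise the function.
d_same = gap_simplified_distance("CASSLGF", "CASSLGF")  # identical sequences
d_sub = gap_simplified_distance("CASSLGF", "CASSAGF")   # one substitution
d_gap = gap_simplified_distance("CASSF", "CASSLGF")     # two-residue gap
```

Each comparison is at most quadratic in the sequence length and uses only simple loops, which is the kind of code that numba can typically compile well.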

Maybe this could even be ported to make use of numba's CUDA kernels to run on GPU.

grst commented 11 months ago

@felixpetschko, since Francesca mentioned you have been playing around with metrics as well, maybe this is something you would find interesting. The TCRdist metric is more relevant in practice than the Levenshtein distance. It's somewhat comparable to the alignment distance that Tobias has been working on, but might be even faster because it simplifies the alignment problem.

The first step would be to get the code from nb_metrics.py to work inside scirpy. As a next step, it would be interesting to see whether this works on GPU using numba's GPU features (or any other framework, as you like).

I'll provide feedback on the clonotype clustering later -- which is also still the priority right now. If we can't speed that up, speeding up the distance metrics is in vain.

ShihanL commented 9 months ago

Hi @grst, is anyone working on implementing this at the moment? If not, I would be interested in having a crack at it.

grst commented 9 months ago

I think @felixpetschko has this on his list -- Felix, is that still the plan?

felixpetschko commented 9 months ago

@grst @ShihanL Yes, I am working on this :)

grst commented 7 months ago

Closed in #502.

This is in the main branch and will be part of the next release. @ShihanL, if you want to give it a try already, you can install the development version using

pip install git+https://github.com/scverse/scirpy.git@main