scverse / scirpy

A scanpy extension to analyse single-cell TCR and BCR data.
https://scirpy.scverse.org/en/latest/
BSD 3-Clause "New" or "Revised" License

Sequence Logos #12

Open grst opened 4 years ago

grst commented 4 years ago

In GitLab by @grst on Jan 24, 2020, 14:06

Here's a quite recent re-implementation of sequence logos in Python that looks promising: https://github.com/jbkinney/logomaker

We will also require an algorithm for multiple sequence alignment in addition to the pairwise one that we have already.
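To make the dependency on alignment concrete, here is a minimal sketch of the per-position statistics a sequence logo is drawn from, assuming the sequences have already been aligned to equal length. The function names and the example CDR3 strings are made up for illustration and are not part of scirpy or logomaker; the information content uses the plain Shannon formula without small-sample correction.

```python
import math
from collections import Counter


def position_counts(aligned_seqs):
    """Count residues per alignment column; all sequences must have equal length."""
    lengths = {len(s) for s in aligned_seqs}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    # zip(*seqs) iterates over alignment columns
    return [Counter(col) for col in zip(*aligned_seqs)]


def information_content(counts, alphabet_size=20):
    """Per-column information in bits: log2(alphabet_size) minus Shannon entropy."""
    bits = []
    for col in counts:
        total = sum(col.values())
        entropy = -sum((n / total) * math.log2(n / total) for n in col.values())
        bits.append(math.log2(alphabet_size) - entropy)
    return bits


# Hypothetical, already aligned CDR3 sequences (invented for this example)
cdr3 = ["CASSLGTF", "CASSLGQF", "CASSPGTF", "CASRLGTF"]
counts = position_counts(cdr3)
bits = information_content(counts)
```

A count matrix like this is the kind of input a logo renderer would scale by information content; sequences of unequal length would first need the multiple sequence alignment mentioned above.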

Things to discuss:

grst commented 4 years ago

In GitLab by @szabogtamas on Jan 24, 2020, 20:08

Marked the task "What sequences to use (TCRA/B primary only?)" as completed.

grst commented 4 years ago

In GitLab by @szabogtamas on Jan 24, 2020, 20:08

Marked the task "What sequences to use (TCRA/B primary only?)" as incomplete.

grst commented 4 years ago

In GitLab by @szabogtamas on Jan 24, 2020, 20:13

For sequence logos, I would consider only the primary alpha and beta chains. The one possible exception is groups made with tcrdist: there, I would want to show a sequence logo of the closest chains, which are not necessarily the primary alpha and beta.

grst commented 4 years ago

In GitLab by @szabogtamas on Jan 24, 2020, 20:14

As for the missing chains: I would just mark them with 'XXXXXXXXXXXXXXXX', as we do for sequence alignment. This is only an initial idea; maybe we can come up with something better later on.
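A rough sketch of that placeholder idea, assuming each cell is represented as a plain dict with optional `alpha` and `beta` CDR3 strings. The dict layout, the helper name, and the placeholder length of 16 are assumptions for illustration, not scirpy's data model.

```python
def chain_or_placeholder(cdr3, length=16):
    """Return the CDR3 sequence, or a run of 'X' if the chain is missing."""
    # length=16 is an arbitrary choice for this sketch
    return cdr3 if cdr3 else "X" * length


# Hypothetical cells; the second one lacks an alpha chain
cells = [
    {"alpha": "CAVSDGGSQGNLIF", "beta": "CASSLGTDTQYF"},
    {"alpha": None, "beta": "CASSPGQGYEQYF"},
]

padded = [
    (chain_or_placeholder(c["alpha"]), chain_or_placeholder(c["beta"]))
    for c in cells
]
```

Cells without a chain then still contribute a row, and the 'X' positions can be ignored or shown as uninformative when the logo is built.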

grst commented 4 years ago

Possibly also include the V & D genes in the sequence logo.

Example from Dash et al. (2017): [image]

grst commented 8 months ago

In sc-best-practices, they use the palmotif library. It looks pretty straightforward: https://www.sc-best-practices.org/air_repertoire/clonotype.html#motif-sequence-analysis