frisen-lab / TREX

Simultaneous lineage TRacking and EXpression profiling of single cells using RNA-seq
MIT License

Adjust minimum length if --per-cell is used #51

Open marcelm opened 1 year ago

marcelm commented 1 year ago

I’m not sure we talked about this already, but with per-cell cloneID correction, it appears to me that the minimum required length for a cloneID can be a bit lower. For example, there is a cell in the test dataset that has these cloneIDs:

GATGACTATACCATTTATTGACCGGCGTAC
---------------TATTGACCGGCGTAC
-------------------GACCGGCGTAC

The second and third cloneIDs are very incomplete (15 and 11 bases), but it’s quite obvious that they represent the same cloneID as the first. Have you taken this into account already?

I was considering this scenario. In this case, only the trailing bases are compared, and since they are all the same, the incomplete cloneIDs should be corrected to the most complete version.

Originally posted by @acorbat in https://github.com/frisen-lab/TREX/issues/36#issuecomment-1764096615
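
To make the scenario concrete, here is a minimal sketch of how such a per-cell correction could work. It is not TREX's actual implementation: the function names (`is_compatible`, `n_known`, `correct_per_cell`) and the `min_bases` threshold of 10 are hypothetical and chosen only for illustration. It assumes all cloneIDs in a cell have the same length, that `-` marks a missing base, and that two cloneIDs are compatible when they agree at every position where both have a base.

```python
def is_compatible(a: str, b: str, gap: str = "-") -> bool:
    """Return True if a and b have equal length and agree at every
    position where both have a known (non-gap) base."""
    if len(a) != len(b):
        return False
    return all(x == y for x, y in zip(a, b) if x != gap and y != gap)


def n_known(clone_id: str, gap: str = "-") -> int:
    """Number of known (non-gap) bases in a cloneID."""
    return len(clone_id) - clone_id.count(gap)


def correct_per_cell(clone_ids, min_bases: int = 10, gap: str = "-"):
    """Map each cloneID of a single cell to the most complete cloneID
    it is compatible with.

    min_bases is a hypothetical threshold: cloneIDs with fewer known
    bases are left out of the result entirely.
    """
    # Rank candidates so that the most complete cloneID comes first.
    ranked = sorted(clone_ids, key=lambda c: n_known(c, gap), reverse=True)
    corrected = {}
    for cid in clone_ids:
        if n_known(cid, gap) < min_bases:
            continue  # too short to be assigned, even within one cell
        # A cloneID is always compatible with itself, so next() finds a match.
        corrected[cid] = next(c for c in ranked if is_compatible(cid, c, gap))
    return corrected


# The three cloneIDs from the cell discussed above.
full = "GATGACTATACCATTTATTGACCGGCGTAC"
partial_15 = "-" * 15 + "TATTGACCGGCGTAC"
partial_11 = "-" * 19 + "GACCGGCGTAC"

result = correct_per_cell([full, partial_15, partial_11])
for original, target in result.items():
    print(original, "->", target)
```

With these inputs, both incomplete cloneIDs map to the complete one. The sketch does not handle ambiguity: if a short cloneID were compatible with two different complete cloneIDs in the same cell, it would simply be assigned to whichever ranks first, which is probably not what one wants in practice.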

acorbat commented 1 year ago

This one might be useful as well.