Closed willblev closed 10 months ago
Hi @willblev,
thanks for reporting this issue! While I think I can easily find a workaround for the Biopython issue, the second part looks a little more sinister. Essentially, the object structure of your Tracer output seems different to what scirpy expects.
What version of tracer are you using? And is there any chance you could send me the Tracer output folder of a single cell that fails? That would be extremely helpful for investigating.
Best, Gregor
Thanks for the speedy reply @grst!
> Essentially, the object structure of your Tracer output seems different to what scirpy expects.
Indeed, this is what I also believed to be the problem.
> What version of tracer are you using?
I downloaded & installed TraCeR a few weeks ago (following instructions from their GitHub repo); the repo says the most recent version is v0.6, however it appears that it actually installs TraCeR v0.5, or at least that is how it shows up in my conda env list. I will look into this further.
> And is there any chance you could send me the Tracer output folder of a single cell that fails?
I have attached the output of one cell which results in the error. Thanks for looking into this!
I started looking into this, and it seems it is becoming increasingly difficult to keep the read_tracer function working.
The pkl files won't load with the latest version of Biopython. Contrary to what I expected, monkey-patching the Biopython library doesn't work, and I don't want to pin an older version of Biopython.
The cdr3nt field is missing, and the cdr3 field is "Couldn't find FGXG" instead of "N/A" in one case. That would be easy to fix, but I am starting to wonder how many more of these edge cases exist.
I'm not sure how to proceed... maybe it would be worth packaging a simple script that converts tracer output to AIRR with the appropriate Biopython version into a docker/singularity container, and dropping direct support for tracer from scirpy itself. For now, I could just try to patch the issues and print a message asking the user to manually install Biopython 1.72 if they want to use read_tracer.
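The "print a message" idea above could look roughly like the following. This is only a sketch, not scirpy's actual code: the helper name require_module and its hint parameter are hypothetical, and it assumes that importing Bio.Alphabet fails with an ImportError on Biopython releases that no longer ship it.

```python
import importlib


def require_module(module_name: str = "Bio.Alphabet", hint: str = "Biopython 1.72") -> None:
    """Raise an informative ImportError if `module_name` cannot be imported.

    Hypothetical helper: meant to be called before unpickling TraCeR output,
    so the user sees an actionable message instead of a pickle traceback.
    """
    try:
        importlib.import_module(module_name)
    except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
        raise ImportError(
            f"Reading TraCeR pickles requires {module_name}, which cannot be "
            f"imported in this environment. Consider installing {hint}."
        ) from exc
```

Calling require_module() at the top of a reader function would fail early with the installation hint, without pinning Biopython for every other user of the package.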
Have you considered any other tools for TCR reconstruction? TRUST4 looks quite nice and is actively maintained, but I haven't yet had a chance to try it myself.
Thanks again for your time and for looking into this!
I inherited this project from a teammate so in my case, TraCeR had already been run (hence I did not consider other TCR reconstruction tools).
I understand the challenge of maintaining your package every time one of these functions breaks due to a deprecated function or a change in file structure... In my case, I may be able to get away with re-running the last step of the TraCeR pipeline (tracer summarize) using an older version of TraCeR so that the .pkl files it generates will be in the format which Scirpy expects.
Which was the last supported version of TraCeR?
> Which was the last supported version of TraCeR?
The version I have been successfully using was built two years ago using this Dockerfile, which is based on the teichlab/tracer:latest image:
https://github.com/icbi-lab/smartseq2_pipeline/tree/master/Docker/tracer
However, I'm afraid the base image has been updated since then, and Docker purged our version of the container as part of their cost-saving measures.
I don't know how familiar you are with Python, but the easiest solution could be to patch the read_tracer function yourself to ignore missing cdr3nt and whatever else pops up:
https://github.com/icbi-lab/scirpy/blob/113ee731bf39c508a6cf049fd87e27fb93685811/scirpy/io/_io.py#L317-L319
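A local patch along those lines might be sketched as follows. This is an illustration under assumptions, not the code at the link above: the helper get_cdr3_fields is hypothetical, the objects are simple stand-ins for whatever TraCeR stores in its pickles, and the sequences are made-up example values. The only details taken from this thread are the attribute names cdr3 and cdr3nt and the two placeholder strings.

```python
from types import SimpleNamespace

# Placeholder strings reported in this thread for a CDR3 that could not be determined.
MISSING_CDR3_VALUES = {"N/A", "Couldn't find FGXG"}


def get_cdr3_fields(recombinant):
    """Return (cdr3, cdr3nt), with None for missing attributes or placeholders."""

    def clean(value):
        # Treat an absent attribute and the known placeholder strings alike.
        if value is None or value in MISSING_CDR3_VALUES:
            return None
        return value

    cdr3 = clean(getattr(recombinant, "cdr3", None))
    cdr3nt = clean(getattr(recombinant, "cdr3nt", None))
    return cdr3, cdr3nt


# Stand-in objects (made-up values): one lacks cdr3nt entirely, one is complete.
incomplete = SimpleNamespace(cdr3="Couldn't find FGXG")
complete = SimpleNamespace(cdr3="CASSLGTDTQYF", cdr3nt="TGTGCCAGC")

print(get_cdr3_fields(incomplete))  # (None, None)
print(get_cdr3_fields(complete))    # ('CASSLGTDTQYF', 'TGTGCCAGC')
```

Using getattr with a default avoids the AttributeError on older pickles, and keeping the placeholder strings in one set makes it easy to add further edge cases as they turn up.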
This issue is getting quite old... if someone is still using tracer and has issues, please open a new one.
Hi! First off, thanks for developing such a useful package. I have been running into some issues while trying to import data from TraCeR into anndata objects using scirpy following the Scirpy tutorial.
As a quick comment, after creating a fresh Scirpy conda env, I initially got an error about Bio.Alphabet, because more recent versions of Biopython no longer include it:
Rolling back to Biopython version 1.72 seemed to solve this. However, when I tried again to import the TCR data from TraCeR using the _io.read_tracer function, I got the following error:
My Scirpy env is as follows:
Thank you in advance for your time and if there is anything else you need to know about my setup please ask away!