Open bcorrie opened 3 years ago
@wyattmcdonnell I guess a follow-on question would be: which fields from the 10X filtered_contig_annotations.csv does the program use? It would probably be pretty easy to convert an AIRR Clone format to this format.
Hello! So far I haven't used it with any data other than 10x or BD data. It might be a little annoying to convert to 10x format, because in the contig_annotations.csv each row is a single chain instead of a single cell, which I think is what the AIRR format is, but you can totally give it a go. The columns that are used are "barcode", "chain", "cdr3", and then either "cdr1" and "cdr2" or "v_gene". I've been wanting to add more input formats, and maybe change it up so it takes pandas dataframes to make this kind of conversion easier, but unfortunately I haven't had much time now that school has started. Let me know how everything goes!
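As a rough illustration of the row-per-cell to row-per-chain reshaping described above, here is a minimal sketch using only the Python standard library. The input keys (`cell_id`, `heavy_cdr3`, `heavy_v_gene`, `light_locus`, `light_cdr3`, `light_v_gene`) are hypothetical names for a per-cell table and the sequences are made-up placeholders; only the output columns "barcode", "chain", "cdr3" and "v_gene" come from the comment above. Whether bcr_dist needs anything beyond these columns, or expects particular chain labels, has not been checked here.

```python
import csv
import io

# Output columns named in the thread (the "v_gene" variant rather than cdr1/cdr2).
OUTPUT_COLUMNS = ["barcode", "chain", "cdr3", "v_gene"]


def cells_to_chain_rows(cells):
    """Expand one-record-per-cell data into one-row-per-chain rows."""
    rows = []
    for cell in cells:
        # Hypothetical input keys; rename them to match the real source table.
        chains = [
            ("IGH", cell.get("heavy_cdr3"), cell.get("heavy_v_gene")),
            (cell.get("light_locus"), cell.get("light_cdr3"), cell.get("light_v_gene")),
        ]
        for chain, cdr3, v_gene in chains:
            if not chain or not cdr3:
                continue  # this cell has no such chain, so emit no row for it
            rows.append({
                "barcode": cell["cell_id"],
                "chain": chain,
                "cdr3": cdr3,
                "v_gene": v_gene or "",
            })
    return rows


def rows_to_csv(rows):
    """Serialize the per-chain rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=OUTPUT_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Placeholder records for illustration only (not real data).
example_cells = [
    {"cell_id": "cell_1", "heavy_cdr3": "CARDYW", "heavy_v_gene": "IGHV1-2",
     "light_locus": "IGK", "light_cdr3": "CQQYNSYPLTF", "light_v_gene": "IGKV1-5"},
    {"cell_id": "cell_2", "heavy_cdr3": "CAKGGW", "heavy_v_gene": "IGHV3-23"},
]
chain_rows = cells_to_chain_rows(example_cells)
csv_text = rows_to_csv(chain_rows)
```

A cell with both chains yields two rows sharing one barcode, and a cell with only a heavy chain yields a single row, which mirrors the one-chain-per-row layout of contig_annotations.csv.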
Thanks - will look into this. We plan on using bcr_dist on some 10X data, so we can use that directly, but I am thinking about how to generalize its use on data from other sources, in particular data that comes from paired chain data in the AIRR data commons.
Trying to get this installed, and there seem to be quite a few Python dependencies that need to be met to use the 10x_test.py code.
Are these listed anywhere? I am installing them one at a time as I get Python import errors, which is a bit painful 8-(
FYI - from a fresh Python virtualenv (Python 3.8) this is what I needed to install:
pip install matplotlib
pip install pandas
pip install scikit-learn
pip install umap-learn
It now seems to be working on our 10X data...
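To avoid finding missing dependencies one import error at a time, a small check like the following could be run first. It is only a sketch: the package list is the four installs reported in this thread, not an official requirements list for bcr_dist, and the helper name `missing_packages` is made up for illustration.

```python
import importlib.util

# pip package name -> import name, for the four installs listed in this thread.
REQUIRED = {
    "matplotlib": "matplotlib",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "umap-learn": "umap",
}


def missing_packages(required):
    """Return the pip names whose top-level module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages(REQUIRED)
if missing:
    # Print a single install command covering everything that is absent.
    print("pip install " + " ".join(missing))
else:
    print("All listed packages are importable.")
```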
Have you used bcr_dist with non-10X data? I see you can load a "BD" file, but I am not familiar with this format...
Hoping to be able to map the emerging AIRR Clone format (https://github.com/airr-community/airr-standards/blob/8e07bd75736c3c32e926f08b6042709940794ded/specs/airr-schema.yaml#L3660) into a file that can be loaded by bcr_dist.