airr-community / gold-standard-datasets

Reference AIRR-Seq datasets for benchmarking tools
Different levels of SHM #7

Open williamdlees opened 3 years ago

williamdlees commented 3 years ago

We should include both 'naive' and highly mutated datasets, so that the performance of tools can be tested under both conditions.

matsohlin commented 3 years ago

PRJNA315079: Libraries of paired VH-VL from CD19+CD20+CD27−IgM-naive B cells and from CD19+CD20+CD27+ antigen-experienced B cells.

schristley commented 3 years ago

Hello @matsohlin, PRJNA315079 is loaded into VDJServer for the AIRR Data Commons and is available for query and download. While we don't have the draft Receptor and other objects, we do have the receptor_id in the rearrangement record to link the two chains. The rearrangements have been minimally processed, and I'd be happy to work with you on any additional processing (clonal assignment, SHM tools, etc.).
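
As a rough illustration of linking the two chains through receptor_id, here is a minimal sketch. It assumes a tab-separated rearrangement export with `sequence_id`, `locus`, and `receptor_id` columns; the helper name `pair_chains` and the sample rows are hypothetical, and the actual VDJServer download may be laid out differently.

```python
import csv
import io
from collections import defaultdict


def pair_chains(tsv_text):
    """Group rearrangement rows by receptor_id and return heavy/light pairs.

    Assumes columns named sequence_id, locus and receptor_id (an assumption
    about the export, not a confirmed VDJServer layout).
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    by_receptor = defaultdict(list)
    for row in reader:
        rid = row.get("receptor_id")
        if rid:  # skip rows that carry no receptor link
            by_receptor[rid].append(row)

    pairs = {}
    for rid, rows in by_receptor.items():
        heavy = [r["sequence_id"] for r in rows if r["locus"] == "IGH"]
        light = [r["sequence_id"] for r in rows if r["locus"] in ("IGK", "IGL")]
        # Keep only unambiguous pairs: exactly one heavy and one light chain.
        if len(heavy) == 1 and len(light) == 1:
            pairs[rid] = (heavy[0], light[0])
    return pairs


# Invented sample rows, for illustration only.
example = (
    "sequence_id\tlocus\treceptor_id\n"
    "seq1\tIGH\tr1\n"
    "seq2\tIGK\tr1\n"
    "seq3\tIGH\tr2\n"
    "seq4\tIGL\tr2\n"
    "seq5\tIGH\tr3\n"
    "seq6\tIGH\t\n"
)

paired = pair_chains(example)
```

Receptors with a missing or ambiguous partner (here `r3` and the row without a receptor_id) are dropped instead of guessed at, which seems the safer default for a benchmarking set.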

matsohlin commented 3 years ago

PRJEB18926: Libraries of VH of IgM, IgG, IgA (and IgE, irrelevant for this purpose as clonal diversity is limited) from 6 subjects. Duplicate samples of bone marrow cells and PBMC. Amplified with BIOMED2 primers in FR1 and constant region primers.

matsohlin commented 3 years ago

PRJNA270742: Full-length libraries of naïve and unsorted/PBMC cells may be of interest in this respect. The data set contains 7 subjects, several of which may be appropriate for this investigation (data on these samples are available as P7 in VDJbase).