Teichlab / bbknn

Batch balanced KNN
MIT License

Details for pbmc dataset used #30 #33

Closed mohit1997 closed 3 years ago

mohit1997 commented 3 years ago

[Reopening Issue] Thanks for your quick reply! I am having trouble finding the 5' dataset on the 10X Genomics website. Is it no longer available? Can you share the link?

"The input data was downloaded from the 10X Genomics website. The exact 5′ dataset was ‘PBMCs of a healthy donor 5′ gene expression’, under Cell Ranger 2.1.0, under V(D)J + 5′ Gene Expression. The exact 3′ dataset was ‘8k PBMCs from a Healthy Donor’, under Cell Ranger 2.1.0, under Chromium Demonstration (v2 Chemistry)."

ktpolanski commented 3 years ago

I had a bit of a root around the 10x website to see if I could locate the 5' dataset, and it no longer appears to be available. It seems that they discontinued a lot of the 5' data for pre-3.x cellranger. This might be because there was a major VDJ processing redesign in cellranger around that time, so they wanted people to see the new and improved version.

You could theoretically root around the various PBMC 5' datasets and compare the number of overlapping barcodes between the BBKNN object and each 10x download, but I can't imagine why you'd want to do this. The analysis was just meant to illustrate BBKNN overcoming a 3'/5' technical effect in 10x data, and the exact choice of 3'/5' data wasn't of much relevance. It doesn't help that it was conducted by someone who has left the lab. Still, the fact that it got streamlined out of the final BBKNN publication should be a further testament to how little it mattered.
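
For anyone who does want to try the barcode comparison, here is a minimal sketch of how it could look. It is not taken from the BBKNN code base: the function names are hypothetical, and it assumes that barcodes differ between the two sources only by trailing numeric suffixes such as `-1` (added by Cell Ranger) or `-1-0` (added when batches are concatenated), which may not hold for your files.

```python
def normalise_barcode(barcode):
    """Strip whitespace and any trailing '-<digits>' suffixes.

    Assumption: suffixes such as '-1' or '-1-0' are bookkeeping added by
    the pipeline or by batch concatenation, not part of the barcode itself.
    """
    core = barcode.strip()
    while True:
        head, sep, tail = core.rpartition("-")
        if sep and head and tail.isdigit():
            core = head
        else:
            return core


def barcode_overlap(reference_barcodes, candidate_barcodes):
    """Return (number of shared barcodes, fraction of the reference recovered)."""
    reference = {normalise_barcode(b) for b in reference_barcodes}
    candidate = {normalise_barcode(b) for b in candidate_barcodes}
    shared = reference & candidate
    fraction = len(shared) / len(reference) if reference else 0.0
    return len(shared), fraction


def rank_candidates(reference_barcodes, candidates):
    """Rank candidate datasets (name -> barcodes) by fraction of reference recovered."""
    scores = {
        name: barcode_overlap(reference_barcodes, barcodes)
        for name, barcodes in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1][1], reverse=True)


if __name__ == "__main__":
    # Toy barcodes for illustration only; these are not real data.
    reference = ["AAAC-1-0", "CCCG-1-0", "GGGT-1-0", "TTTA-1-0"]
    candidates = {
        "candidate_a": ["AAAC-1", "ACGT-1"],
        "candidate_b": ["AAAC-1", "CCCG-1", "GGGT-1"],
    }
    for name, (n_shared, fraction) in rank_candidates(reference, candidates):
        print(name, n_shared, fraction)
```

In practice the reference barcodes would presumably come from the 5' cells of the BBKNN object (for an AnnData object, its `obs_names`) and each candidate list from the `barcodes.tsv` file of a 10x download. A candidate that recovers nearly all of the reference barcodes is a plausible match, though this is only a heuristic.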