csoneson / alevinQC

Create QC and summary reports for Alevin output
https://csoneson.github.io/alevinQC/

Report for 10X feature barcoding #21

Open ashuchawla opened 3 years ago

ashuchawla commented 3 years ago

Hi,

Does this work for the feature barcoding result files as well?

I got the following error message when trying it with HTOs: Error in checkAlevinInputFiles(baseDir) : Input directory not compatible with Salmon v0.14 or newer (without external whitelist), the following required file(s) are missing or malformed:

Thanks,

Ashu

csoneson commented 3 years ago

Hi @ashuchawla, I have not actually tried to use alevinQC with feature barcoding data, and I'm not sure whether all of the plots would still be useful. Could you also paste the rest of the error message (i.e., which file(s) alevinQC identifies as missing or malformed)? Also tagging @k3yavi, who might have some ideas from the alevin side itself. Thanks!

csoneson commented 3 years ago

👋🏻 @ashuchawla - I had a chat with @k3yavi and indeed, not all files required by alevinQC are currently exported by alevin when it is run for feature barcoding. As I mentioned, it's not clear whether the same types of plots would be the most useful in this application. If you have any specific suggestions of aspects you'd like to be able to investigate, let us know.

ashuchawla commented 3 years ago

Hi @csoneson ,

Thank you so much for the reply, I figured that might be the reason. I was hoping to see some stats similar to what Cell Ranger reports, since it processes both GEX and HTO data together in its "count" program.

Thanks,

Ashu

k3yavi commented 3 years ago

Hi @ashuchawla ,

Thanks for raising the issue. You are right: ideally it'd be super cool if we could produce QC for feature barcodes and RNA-seq combined. Given that feature barcodes are rarely used in isolation (except probably with guide RNAs), a combined QC makes even more sense. Unfortunately, alevin currently can't be run on the two kinds of data (GEX & HTO) together; they have to be quantified separately, which makes merging a bit complicated, as the data has to be post-processed, for example to figure out the group of cellular barcodes shared across the modalities. Sometimes, depending on the feature barcode sequences used during library prep, they have to be customized by altering the sequence post hoc to find the common sequences, which makes the problem even harder. We are actively thinking about how best to produce combined QC, but unfortunately it's still in the early stages. We'll let you know once we make some progress and have a stable version of the software. Thanks again for raising an important point and letting us know you'd be interested in such a use case.
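
For readers wondering what the post-processing step mentioned above could look like, here is a minimal Python sketch of finding the cellular barcodes shared between a GEX run and an HTO run that were quantified separately. It is not part of alevin or alevinQC. The function name `shared_barcodes`, the `translate` hook, and the toy barcodes are all assumptions made for illustration; the real sequence adjustment, if one is needed, depends on the library prep chemistry and is not specified in this thread.

```python
from typing import Callable, Iterable, Optional, Set


def shared_barcodes(
    gex_barcodes: Iterable[str],
    hto_barcodes: Iterable[str],
    translate: Optional[Callable[[str], str]] = None,
) -> Set[str]:
    """Return the cellular barcodes present in both modalities.

    `translate` is a hypothetical hook: if the HTO barcodes need their
    sequence altered before they match the GEX barcodes, pass a function
    that maps an HTO barcode to its GEX equivalent.
    """
    # Normalise whitespace and drop empty entries.
    gex = {bc.strip() for bc in gex_barcodes if bc.strip()}

    hto = set()
    for bc in hto_barcodes:
        bc = bc.strip()
        if not bc:
            continue
        hto.add(translate(bc) if translate else bc)

    # Barcodes observed in both quantifications.
    return gex & hto


# Toy data, not real barcodes.
gex = ["AAAC", "AAAG", "CCTA"]
hto = ["AAAC", "CCTT", "GGGA"]


def fix_last_base(bc: str) -> str:
    """Made-up translation rule, for illustration only."""
    return bc[:-1] + "A" if bc.endswith("T") else bc


print(sorted(shared_barcodes(gex, hto)))
print(sorted(shared_barcodes(gex, hto, translate=fix_last_base)))
```

In practice the two input lists would come from the barcode lists written by each separate quantification, and a combined QC would then be restricted to the returned set.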