Alevin- appropriate to run separate lanes and run 'quantmerge'?

pinin4fjords commented 5 years ago

Hi,

Quick question: if I have the reads for a library spread across multiple files, is it appropriate to run Alevin separately on each file pair and combine the results with quantmerge, rather than processing them together? I'm looking to run Alevin via Galaxy for some training, and the available wrapper doesn't currently allow supplying multiple inputs. My feeling is that all files should be processed together for robust thresholding etc., but I may be worrying about nothing.

k3yavi commented 5 years ago

Hi @pinin4fjords, thanks for raising an important question and for using alevin in the training. I think there is some confusion about the quantmerge command: it works only with bulk RNA-seq quants, not with alevin output. As for running a separate alevin instance per file pair, the answer depends on where the separate files come from: are they from separate lanes, or are they split by cellular barcode? The basic intuition is that after the initial barcode assignment, alevin works on each cell independently, so as long as you are confident that the file pairs are cell-disjoint, you can simply cat the outputs of the alevin quants at the end. Also, depending on what the training is about, there are several workarounds; for example, you could use one of the very small 100-cell (7 million reads) datasets from 10x and combine everything into one file if size or the number of files is a problem.
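To make the cell-disjoint condition concrete, here is a minimal Python sketch of combining per-cell results from separate runs. It assumes each run's counts have already been loaded into memory as a list of barcodes, a list of gene names, and a cells-by-genes list of rows; the function name `combine_cell_disjoint` and this in-memory layout are hypothetical and are not part of alevin or salmon. The sketch refuses to combine runs that share a barcode, which is exactly the case where running separately would be wrong.

```python
from typing import List, Sequence, Tuple

# One run = (cell barcodes, gene names, cells-by-genes count rows).
# This layout is an assumption for illustration, not alevin's on-disk format.
Run = Tuple[Sequence[str], Sequence[str], Sequence[Sequence[float]]]


def combine_cell_disjoint(runs: Sequence[Run]) -> Tuple[List[str], List[str], List[List[float]]]:
    """Stack per-cell rows from several runs, but only if no barcode repeats."""
    merged_barcodes: List[str] = []
    merged_rows: List[List[float]] = []
    seen = set()
    ref_genes = None

    for barcodes, genes, matrix in runs:
        # All runs must report the same genes in the same column order.
        if ref_genes is None:
            ref_genes = list(genes)
        elif list(genes) != ref_genes:
            raise ValueError("gene columns differ between runs")

        if len(barcodes) != len(matrix):
            raise ValueError("number of barcodes does not match number of rows")

        # A shared barcode means the runs are not cell-disjoint, so simple
        # concatenation would split one cell's reads across two rows.
        overlap = seen.intersection(barcodes)
        if overlap:
            raise ValueError(f"runs are not cell-disjoint: {sorted(overlap)}")

        seen.update(barcodes)
        merged_barcodes.extend(barcodes)
        merged_rows.extend(list(row) for row in matrix)

    return merged_barcodes, ref_genes or [], merged_rows
```

With files from multiple lanes of one library, the same barcodes appear in every lane, so a check like this would fail and concatenation is not appropriate.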

pinin4fjords commented 5 years ago

Thanks @k3yavi for the clarification. In my example case the files are not cell disjoint, being multiple lanes run from the same library. Obviously I can use just one lane for the training, but to be clear: in the real world in this situation all files for a library need to be run together, right?

k3yavi commented 5 years ago

Yes, that's correct: all lanes of the library should be run together with alevin.
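As a rough illustration of running all lanes in one invocation, the Python sketch below builds a single `salmon alevin` command line in which every lane's barcode file follows `-1` and every lane's read file follows `-2`, in the same lane order. The helper name `build_alevin_command` and the file names in the usage note are hypothetical; `-l ISR` and `--chromium` assume a 10x Chromium v2 library and should be changed for other protocols.

```python
from typing import List, Sequence, Tuple


def build_alevin_command(
    lane_pairs: Sequence[Tuple[str, str]],
    index: str,
    tgmap: str,
    out_dir: str,
    threads: int = 8,
) -> List[str]:
    """Build one alevin command covering every (barcode file, read file) lane pair."""
    if not lane_pairs:
        raise ValueError("at least one lane pair is required")

    # Keep the two lists in the same lane order so mates stay paired.
    barcode_files = [pair[0] for pair in lane_pairs]
    read_files = [pair[1] for pair in lane_pairs]

    return [
        "salmon", "alevin",
        "-l", "ISR",          # library type (assumed: 10x Chromium)
        "-1", *barcode_files,  # CB+UMI files for all lanes
        "-2", *read_files,     # read-sequence files for all lanes
        "--chromium",          # protocol flag (assumed: Chromium v2)
        "-i", index,
        "-p", str(threads),
        "-o", out_dir,
        "--tgMap", tgmap,
    ]
```

For example, `build_alevin_command([("L001_R1.fastq.gz", "L001_R2.fastq.gz"), ("L002_R1.fastq.gz", "L002_R2.fastq.gz")], "index", "txp2gene.tsv", "alevin_out")` returns an argument list that could be passed to `subprocess.run(cmd, check=True)`.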

k3yavi commented 5 years ago

I think the issue is answered here, feel free to reopen.

COMBINE-lab / salmon, issue #434