Open kamilkrukowski opened 2 months ago
Hi, thanks for your question. @martinkim0, we wanted to add a stratified train/validation splitter at some point but never followed up on it. I guess this would yield a very similar result. For now, you can subset your object before running scVI.
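To make the subsetting suggestion concrete, here is a minimal sketch of picking a balanced set of row indices per group before training. The helper name `balanced_subset_indices` and the toy labels are made up for illustration; in practice the labels would presumably come from something like `adata.obs[batch_key]`.

```python
import random
from collections import defaultdict


def balanced_subset_indices(labels, n_per_group=None, seed=0):
    """Return sorted row indices with at most n_per_group cells per group.

    Hypothetical helper for illustration, not part of scvi-tools.
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for i, label in enumerate(labels):
        by_group[label].append(i)
    # Default: downsample every group to the size of the smallest one.
    if n_per_group is None:
        n_per_group = min(len(idxs) for idxs in by_group.values())
    keep = []
    for idxs in by_group.values():
        k = min(n_per_group, len(idxs))
        keep.extend(rng.sample(idxs, k))  # sampling without replacement
    return sorted(keep)


# Toy group labels standing in for a batch column.
labels = ["a"] * 6 + ["b"] * 2 + ["c"] * 4
idx = balanced_subset_indices(labels)
# With an AnnData object, the subset could then be taken as:
#   adata_sub = adata[idx].copy()
```

This only undersamples, so cells from the larger groups are discarded rather than down-weighted, which is not the same as a weighted loss.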
Could scVI models support a weighted loss, with weights for individual cells or batch_key groups?
Some potential uses
To my knowledge, oversampling/undersampling is not a viable solution here because scvi-tools requires in-memory AnnData objects containing the entire dataset. Is there any way to "stream" data loading with meaningful oversampling that does not require holding all oversampled entries in memory for the entirety of training?
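For what it's worth, oversampling does not have to duplicate rows in memory: one can draw row indices with replacement, weighted by inverse group frequency, and look the rows up batch by batch. Below is a minimal, framework-free sketch of that idea. The function names are hypothetical, and whether the scvi-tools data loaders can be driven by such a sampler is not something this thread establishes. PyTorch's `torch.utils.data.WeightedRandomSampler` follows the same principle.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each cell by 1 / (size of its group), so groups are drawn equally often."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def weighted_index_batches(weights, batch_size, n_batches, seed=0):
    """Yield minibatches of row indices sampled with replacement.

    Only indices are generated; the oversampled rows are never copied.
    """
    rng = random.Random(seed)
    population = range(len(weights))
    for _ in range(n_batches):
        yield rng.choices(population, weights=weights, k=batch_size)


# Toy group labels: group "b" is rare and gets oversampled relative to its size.
labels = ["a"] * 8 + ["b"] * 2
weights = inverse_frequency_weights(labels)
batches = list(weighted_index_batches(weights, batch_size=4, n_batches=3))
# Each batch is a list of row indices that would be used to slice the data lazily.
```

The memory cost is one float per cell for the weights plus one minibatch of indices at a time, independent of how strongly the rare groups are oversampled.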