theislab / scarches

Reference mapping for single-cell genomics
https://docs.scarches.org/en/latest/
BSD 3-Clause "New" or "Revised" License

Speed improvement #218

Open moinfar opened 8 months ago

moinfar commented 8 months ago

Speed improvement in:

  1. Data splitting (scPoli, when cell_types are available)
  2. PyTorch data loader: when using MultiConditionAnnotatedDataset, we now fetch cells in batches rather than fetching them cell by cell and combining them in custom_collate. This may speed up the dataloader significantly, especially when loading sparse data.

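
The batched-fetch idea in item 2 can be illustrated with a minimal sketch. This is not the code from this PR: the helper names `fetch_cell_by_cell` and `fetch_batched` and the toy matrix are hypothetical, and the sketch only assumes the expression data is held in a SciPy CSR matrix. It contrasts slicing one row per cell and stacking afterwards (roughly what a per-item dataset plus a collate function does) with a single fancy-indexing call for the whole batch.

```python
import numpy as np
from scipy import sparse


def fetch_cell_by_cell(X, indices):
    """Slice one row per cell, densify each row, then stack the results."""
    rows = [X[i].toarray().ravel() for i in indices]
    return np.stack(rows)


def fetch_batched(X, indices):
    """Slice all requested rows in one sparse indexing call, densify once."""
    return X[indices].toarray()


rng = np.random.default_rng(0)
# Hypothetical toy expression matrix: 1000 cells x 200 genes, ~5% non-zero.
X = sparse.random(
    1000, 200, density=0.05, format="csr", random_state=0, dtype=np.float32
)
batch_indices = rng.choice(X.shape[0], size=128, replace=False)

per_cell = fetch_cell_by_cell(X, batch_indices)
batched = fetch_batched(X, batch_indices)

# Both strategies return the same dense batch; only the number of
# sparse slicing operations differs (128 versus 1).
print(per_cell.shape, np.array_equal(per_cell, batched))
```

Both functions yield the same array, so any speedup would come purely from replacing many small sparse slices with one larger one. How large the gain is depends on batch size and sparsity and would need to be measured on real data.
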
jan-engelmann commented 4 months ago

Hi! What's the status here? Can this be merged?

Koncopd commented 3 months ago

@moinfar Sorry for being slow on this. Have you rerun the scPoli tutorials with this PR to check that everything still works?