AllenCellModeling / geneselection

only the best genes
MIT License

cardio data -- raw vs normalized counts #15

Open donovanr opened 5 years ago

donovanr commented 5 years ago

For the cardio data Tanya says:

Counts are divided by the cell's total counts and multiplied by a scaling factor (10000), then log1p is applied to the result. Raw counts are also stored in the anndata object (I think in .raw)

I think we should use the raw counts and explicitly code the normalization so that we know exactly how to back out counts to compare to FISH down the line.
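A rough sketch of what coding the normalization explicitly could look like, assuming the recipe is exactly as described above (divide by the per-cell total, multiply by 10000, then log1p). The function names `normalize_cell` and `denormalize_cell` are made up for illustration and are not part of this repo or of anndata. It works on one cell at a time as a plain list of counts.

```python
import math

SCALE_FACTOR = 10_000  # scaling factor from the description above


def normalize_cell(raw_counts, scale=SCALE_FACTOR):
    """Return (log-normalized values, total counts) for one cell.

    Hypothetical helper: assumes normalization is
    log1p(count / cell_total * scale).
    """
    total = sum(raw_counts)
    if total == 0:
        # An empty cell has nothing to scale; keep it all zeros.
        return [0.0 for _ in raw_counts], 0
    normalized = [math.log1p(c / total * scale) for c in raw_counts]
    return normalized, total


def denormalize_cell(normalized, total, scale=SCALE_FACTOR):
    """Invert normalize_cell: expm1 undoes log1p, then undo the scaling."""
    return [math.expm1(v) * total / scale for v in normalized]


raw = [5, 0, 15]
normalized, total = normalize_cell(raw)
recovered = denormalize_cell(normalized, total)
```

Note that backing out counts needs the per-cell total as well as the scaling factor, so the totals have to be kept alongside the normalized values (or recomputed from the raw counts).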

gregjohnso commented 5 years ago

Agreed

We should solicit good ideas on how to normalize across FISH and transcriptomics data.

heeler commented 5 years ago

I'm confused: are you looking to translate across experiments, or are you looking for normalization strategies for the model's input data? Normalization strategies all have their own quirks and biases.

donovanr commented 5 years ago