immunogenomics / presto

Fast Wilcoxon and auROC

Normalize Data or Not before using presto? #33

Closed by NeuralBind 3 months ago

NeuralBind commented 3 months ago

Hello, I know that the Wilcoxon rank-sum test does not assume any particular distribution. Is it better to use the normalized, scaled data or just the raw counts? Cheers!

slowkow commented 3 months ago

If you are looking for genes that might be good markers for cell clusters, then use a normalized value like logCPM.

In general, it is a good idea to run a simulation with fake data to understand the consequences of different analysis strategies. Once you understand those consequences, you can make an informed decision about which steps to include in your analysis (e.g. scaling, normalization, and so on).
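As an illustration of that kind of simulation, here is a minimal sketch. It is written in plain Python and does not use presto. All function names, cluster depths, and the expression fraction are made up for the example, and logCPM is assumed here to mean `log(1 + 1e6 * count / total_counts)`; other definitions (log2, different pseudocounts) exist. The fake gene has the same relative expression in both clusters, but cluster B is sequenced more deeply, so any apparent difference comes from depth alone.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) sample (Knuth's method; fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def log_cpm(count, depth):
    """Assumed logCPM definition: natural log of (1 + counts per million)."""
    return math.log1p(1e6 * count / depth)


def auroc(group_a, group_b):
    """P(b > a) + 0.5 * P(b == a), computed from rank sums with midranks for ties."""
    pooled = sorted([(v, 0) for v in group_a] + [(v, 1) for v in group_b])
    rank_sum_b = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        rank_sum_b += midrank * sum(label for _, label in pooled[i:j])
        i = j
    n_a, n_b = len(group_a), len(group_b)
    u_b = rank_sum_b - n_b * (n_b + 1) / 2  # Mann-Whitney U for group b
    return u_b / (n_a * n_b)


def simulate(n_cells=500, fraction=0.005, seed=0):
    """Return (AUROC on raw counts, AUROC on logCPM) for a gene with no true effect."""
    rng = random.Random(seed)
    # Hypothetical per-cell total counts: cluster B is about 3x deeper than A.
    clusters = {"A": (1500, 2500), "B": (4500, 7500)}
    raw, norm = {}, {}
    for name, (lo, hi) in clusters.items():
        depths = [rng.randint(lo, hi) for _ in range(n_cells)]
        # Same expression fraction in both clusters: no real difference.
        counts = [poisson(rng, d * fraction) for d in depths]
        raw[name] = counts
        norm[name] = [log_cpm(c, d) for c, d in zip(counts, depths)]
    return auroc(raw["A"], raw["B"]), auroc(norm["A"], norm["B"])


if __name__ == "__main__":
    auc_raw, auc_norm = simulate()
    print(f"AUROC on raw counts: {auc_raw:.3f}")
    print(f"AUROC on logCPM:     {auc_norm:.3f}")
```

Under these assumptions, the raw counts should make the gene look like a strong marker for cluster B, while the logCPM values should give an AUROC much closer to 0.5. Changing `fraction` or the depth ranges shows how sensitive each strategy is; at very low counts, discreteness can leave a small residual depth effect even after normalization.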

GitHub issue pages are intended for discussing the code in a repository, not for general questions.

The best places for general questions and discussions are Biostars, Bioconductor, or Stack Overflow. There, you will find many experts ready to answer your questions. There is a good chance that your question has already been asked and answered, so please try the search functions on those sites first.

Good luck!