BaselAbujamous / clust

Automatic and optimised consensus clustering of one or more heterogeneous datasets

Explain how automatic normalization works #26

Closed (apcamargo closed this issue 5 years ago)

apcamargo commented 5 years ago

Hi Basel,

I've been using Clust a lot lately and have found that the automatic normalization sometimes applies different normalization methods to fairly similar datasets. For example, dataset A has more zeros than dataset B; dataset A is normalized with 101 31 4, while dataset B is normalized with 101 3 4.

I understand the reasoning behind these choices, but I think it would be better if the automatic normalization were described in detail in the documentation.

BaselAbujamous commented 5 years ago

Hi

Thanks again for your very valuable comments and contributions to clust. I am trying to keep the documentation as understandable as possible for biologists who don't want to bother with the backend details. Therefore, I would rather keep the automatic normalisation section in particular simple. However, I have added this description to the "Codes suggested for commonly used datasets" section:

Based on these, if the codes recommended for your data include the code 3, but the dataset has too many zeros or some negative values, it is recommended to use 31 in place of 3. For example, if you have one-colour microarray data with too many zeros or a few negative values, use 101 31 4 instead of 101 3 4.
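
The substitution rule described above can be sketched in a few lines of Python. This is only an illustration, not Clust's actual automatic-normalisation code: the function name `suggest_codes` and the `max_zero_fraction` threshold are hypothetical, since the thread does not say how many zeros count as "too many".

```python
def suggest_codes(codes, rows, max_zero_fraction=0.1):
    """Swap code 3 for 31 when the data has many zeros or any negative values.

    `codes` is a recommended code sequence such as [101, 3, 4].
    `rows` is the dataset as a list of rows of numbers.
    `max_zero_fraction` is a hypothetical cutoff (an assumption); the
    thread does not define what counts as "too many zeros".
    """
    values = [v for row in rows for v in row]
    if not values:
        # Nothing to inspect, so leave the recommendation unchanged.
        return list(codes)

    zero_fraction = sum(1 for v in values if v == 0) / len(values)
    has_negative = any(v < 0 for v in values)

    if zero_fraction > max_zero_fraction or has_negative:
        return [31 if c == 3 else c for c in codes]
    return list(codes)


# Toy data, made up for illustration only.
dense = [[5.0, 7.2, 6.1], [3.3, 4.8, 9.0]]    # no zeros, no negatives
sparse = [[0.0, 7.2, 0.0], [3.3, 0.0, 9.0]]   # half of the values are zero

print(suggest_codes([101, 3, 4], dense))   # [101, 3, 4]
print(suggest_codes([101, 3, 4], sparse))  # [101, 31, 4]
```

With a rule of this kind, two datasets that look alike can still end up with different code sequences if one of them falls just above the zero cutoff, which would be consistent with the behaviour reported at the top of this thread.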

apcamargo commented 5 years ago

Thanks, Basel!