BaselAbujamous / clust

Automatic and optimised consensus clustering of one or more heterogeneous datasets

Different number of clusters and genes in each cluster based on the order of samples in Replicates.txt file #78

Open dadrasarmin opened 2 years ago

dadrasarmin commented 2 years ago

Hello,

I have normalized gene expression data from an algal experiment on a gradient stress table. For background: the table holds 42 samples. Temperature increases from left to right (seven columns) and light intensity increases from bottom to top (six rows).

I was looking for clusters of genes whose expression patterns are consistent across my samples. To account for the replicates, I used a Replicates.txt file as instructed in the readme. I chose "-n 101 4" for normalization.
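To make the setup concrete, here is a minimal sketch of how the same Replicates lines could be written in two different orders. The condition names (`T1_L1` ...), the dataset file name `data.tsv`, the replicate suffixes, the count of three replicates per condition, and the tab-separated layout (dataset file, condition, replicate column names) are all assumptions for illustration, not taken from my real files.

```python
from itertools import product

# 7 temperature columns x 6 light rows, as on the gradient table.
# N_REPS = 3 is a hypothetical replicate count, used only for illustration.
N_TEMPS, N_LIGHTS, N_REPS = 7, 6, 3


def replicate_lines(order):
    """Return one Replicates line per condition, in the requested order.

    order="temperature": group lines by temperature, then light.
    order="light":       group lines by light intensity, then temperature.
    """
    cells = list(product(range(1, N_TEMPS + 1), range(1, N_LIGHTS + 1)))
    if order == "light":
        cells.sort(key=lambda cell: (cell[1], cell[0]))
    elif order != "temperature":
        raise ValueError(f"unknown order: {order!r}")

    lines = []
    for temp, light in cells:
        cond = f"T{temp}_L{light}"  # hypothetical condition name
        reps = [f"{cond}_R{r}" for r in range(1, N_REPS + 1)]
        # Assumed layout: dataset file, condition name, replicate columns.
        lines.append("\t".join(["data.tsv", cond, *reps]))
    return lines


by_temp = replicate_lines("temperature")
by_light = replicate_lines("light")
```

Both lists contain exactly the same 42 lines; only their order differs, which is the only thing I changed between runs.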

Unexpectedly, when I rearranged the lines in my "Replicates.txt", I obtained different numbers of clusters and different numbers of genes in each cluster, depending on how I arranged my samples on the x-axis (sorted by temperature or by light intensity, for example). I had changed the line order expecting the same clusters, just with more "beautiful" patterns.
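In case it helps to quantify the difference, this is a rough sketch of how two runs could be compared independently of cluster labels, by matching each cluster in one run to its most similar cluster in the other using the Jaccard index. The gene IDs and cluster names below are made up, and `best_matches` is a hypothetical helper, not part of clust.

```python
def jaccard(a, b):
    """Jaccard index of two gene sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def best_matches(run_a, run_b):
    """Map each cluster in run_a to (closest cluster in run_b, Jaccard score)."""
    result = {}
    for name_a, genes_a in run_a.items():
        name_b, score = max(
            ((nb, jaccard(genes_a, gb)) for nb, gb in run_b.items()),
            key=lambda pair: pair[1],
        )
        result[name_a] = (name_b, score)
    return result


# Toy data standing in for the gene lists of two runs (hypothetical IDs).
run_temp_order = {"C0": {"g1", "g2", "g3"}, "C1": {"g4", "g5"}}
run_light_order = {"C0": {"g4", "g5", "g6"}, "C1": {"g1", "g2"}, "C2": {"g3"}}

matches = best_matches(run_temp_order, run_light_order)
for name, (other, score) in matches.items():
    print(f"{name} -> {other} (Jaccard {score:.2f})")
```

A score of 1.0 for every cluster would mean the two runs differ only in labelling; lower scores mean that cluster membership itself changed.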

Is it naïve of me to assume that changing the order of the lines in the Replicates file should still produce the same clusters? I don't understand why that should have an impact on the result. Could you please guide me in this regard? If the line order does influence clustering, how should I order the lines in my Replicates file so that I get the "correct" output?

Best wishes,
Dadras, Armin

Clusters_profiles.pdf Clusters_profiles.pdf Clusters_profiles.pdf