pixushi / tempted

Issue with toppct ratio/aggregate and bottompct ratio/aggregate being identical #2

Open acortmann opened 1 month ago

acortmann commented 1 month ago

I've been applying tempted to my longitudinal dataset and generally am finding it very useful.

When I run tempted_all with absolute=TRUE or absolute=FALSE, the OTUs listed for the toppct ratio and the bottompct ratio appear to be identical, regardless of how absolute is set and regardless of the percentage I choose. The same happens with the metafeature aggregate lists when the percentage is set below 1.

Based on reading the documentation, I thought this list would differ between the top and bottom. Looking at the included OTUs compared to the PC loadings, it looks like the included OTUs are only those with positive loadings.

Since the ratios in the metafeature_ratio and the metafeature_aggregate appear to make sense, it isn't clear if those are calculated using the correct OTUs or not.

Is the bottompct ratio/aggregate list pulling the wrong data from the analysis?

Thanks for your help.

pixushi commented 2 weeks ago

Thanks for your interest in our method!

In the current version, aggregate_feature() does not accept the absolute=TRUE/FALSE option, so metafeature_aggregate and toppct_aggregate are not affected by it. We will add this option in the next update.

ratio_feature(), on the other hand, should return different results for absolute=TRUE and absolute=FALSE. I ran tempted_all() on the example data with each setting, and it returned different results for metafeature_ratio, toppct_ratio, and bottompct_ratio. One possible reason the setting makes no difference in your data is that the feature loadings are very symmetric around zero, so ranking them by absolute value or by signed value picks the same features from the top and bottom of the list. Does this answer your question?
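
To illustrate the symmetry point, here is a minimal Python sketch. It is not the package's code: the two selection rules, the function names, and the toy loadings are all assumptions made up for this example, and the actual thresholds in ratio_feature() may be computed differently. It only shows that ranking by signed value and ranking by absolute value can pick the same features when loadings are symmetric around zero, and different features when they are skewed.

```python
def select_signed(loadings, k):
    """Hypothetical rule: rank by signed loading; top = k largest, bottom = k smallest."""
    order = sorted(loadings, key=loadings.get)  # ascending by loading
    return set(order[-k:]), set(order[:k])


def select_absolute(loadings, k):
    """Hypothetical rule: keep the 2k largest |loading| values, then split them by sign."""
    order = sorted(loadings, key=lambda f: abs(loadings[f]), reverse=True)
    kept = order[:2 * k]
    top = {f for f in kept if loadings[f] > 0}
    bottom = {f for f in kept if loadings[f] < 0}
    return top, bottom


# Toy loadings (made up): one set symmetric around zero, one skewed positive.
symmetric = {"otu1": 0.9, "otu2": 0.5, "otu3": 0.1,
             "otu4": -0.1, "otu5": -0.5, "otu6": -0.9}
skewed = {"otu1": 0.9, "otu2": 0.8, "otu3": 0.7,
          "otu4": 0.05, "otu5": -0.1, "otu6": -0.2}

# Compare the (top, bottom) pairs produced by the two rules.
same_on_symmetric = select_signed(symmetric, 2) == select_absolute(symmetric, 2)
same_on_skewed = select_signed(skewed, 2) == select_absolute(skewed, 2)
print(same_on_symmetric, same_on_skewed)  # True False
```

If your own loadings behave like the skewed case and the lists are still identical across settings, symmetry is unlikely to be the explanation.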