QuKunLab / SpatialBenchmarking

BSD 2-Clause "Simplified" License

Raw data causes overestimated performance when measured by PCC #8

Closed: Czh3 closed this issue 1 year ago

Czh3 commented 2 years ago

Thank you for providing this great benchmark. It's very interesting that performance improves when raw data is used as input (Fig. S4 in your paper). I tested the performance of SpaGE on datasets 9 and 7, and it seems that outliers dominate the PCC. Take Nos1 as an example: several cells show extremely high expression, which results in PCC = 0.76 (left panel: raw data; right panel: normalized data). The normalized values are more accurate and informative. [figure: Nos1 predicted vs. measured expression, raw and normalized] After removing 8 outliers, the PCC declines to 0.29. [figure: the same comparison after outlier removal]

In dataset 7, the PCCs computed on raw data are very high, but the expression patterns don't look similar; huge differences are observed. This means that PCCs on raw data do not represent imputation performance, and normalized data is more appropriate. Fig. 2 and Fig. S2 are based on raw data, so I am wondering which method shows the best performance on normalized data (Fig. S4 is hard to compare across methods). Also, could you upload the evaluation metrics based on normalized data (similar to what you provide in spatialbenchmarking/FigureData/Figure2/Metrics/Data1/)? [figure: predicted vs. measured expression patterns in dataset 7]
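The outlier effect described above is easy to reproduce. Below is a small synthetic sketch (random data, not the actual dataset values): two unrelated expression vectors plus a handful of extreme "outlier" cells yield a high PCC that collapses once those cells are removed.

```python
import numpy as np
from scipy.stats import pearsonr

# Synthetic example: predicted vs. measured expression for one gene.
# The bulk of the cells are unrelated between the two vectors; a few
# cells have extreme values in both, mimicking the Nos1 case above.
rng = np.random.default_rng(0)
n = 200
measured = rng.exponential(1.0, n)
predicted = rng.exponential(1.0, n)  # independent of `measured`

# Append 8 outlier cells with extreme expression in both vectors.
measured = np.concatenate([measured, rng.uniform(50, 100, 8)])
predicted = np.concatenate([predicted, rng.uniform(50, 100, 8)])

r_all, _ = pearsonr(measured, predicted)    # inflated by the outliers
r_trim, _ = pearsonr(measured[:n], predicted[:n])  # outliers removed
print(f"PCC with outliers:    {r_all:.2f}")
print(f"PCC without outliers: {r_trim:.2f}")
```

The 8 extreme cells dominate both means and variances, so the PCC over all cells is high even though the underlying prediction carries no signal.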

wenruyustc commented 2 years ago

Hi, thanks for your suggestion! In fact, to avoid the influence of the data input, we compared the results on standardized and non-standardized data, and the results showed that Tangram, SpaGE, gimVI, and SpaOTsc are relatively good methods (Fig. S5). Therefore, since it does not affect the conclusion, we used the raw-data results for presentation. But as you said, the results for some genes may not be statistically significant, so we used the average accuracy of the predictions across the data. Also, it may be helpful to scale the data so that the range of values in the raw data is not as large when displaying the raw-data results. Archive.zip
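One common way to compress the dynamic range before display is library-size normalization followed by a log1p transform. A minimal NumPy sketch (this mirrors the usual `scanpy` `pp.normalize_total` + `pp.log1p` pipeline, but the exact scaling used in the paper is not specified here):

```python
import numpy as np

# Toy cells-x-genes count matrix with one extreme value.
counts = np.array([[0., 1., 50.],
                   [2., 3., 4.]])

# Per-cell library-size normalization to the mean library size,
# then log1p to shrink extreme values for plotting.
libsize = counts.sum(axis=1, keepdims=True)
norm = counts / libsize * libsize.mean()
lognorm = np.log1p(norm)
```

After this transform the extreme count no longer dominates the axis range, so scatter plots are readable for both raw-like and normalized comparisons.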

The attachment is the metrics file of the above two data.

Czh3 commented 2 years ago

Thank you very much. The data is helpful for understanding the performance; however, it only contains 2 datasets. Could you upload the metrics for all datasets? If you could also upload the raw PCC values used for plotting Fig. S4, that would be best.

wenruyustc commented 2 years ago

Hello, the attachment is the metrics file based on normalized data for all datasets. The data for plotting Fig. S4 was uploaded as source data with our published paper. NormlizationMetric.zip

qiaochen commented 2 years ago

Thank you for providing the results on normalized data, but the results for Data6 and Data8 seem to be missing?

wenruyustc commented 2 years ago

Data6 and Data8 were already normalized when we downloaded them.

Czh3 commented 2 years ago

Thank you very much. This helps me a lot. The LIGER files are missing in Data2 and Data11, gimVI is missing in Data3 and Data5, and SpaGE is missing in Data12.

wenruyustc commented 2 years ago

Hi~ An error occurred when we ran LIGER or gimVI on these datasets.

qiaochen commented 2 years ago

Hi, the assumptions of Pearson's correlation coefficient might be causing the problem raised in this issue. I wonder if a metric like Spearman's rank correlation coefficient (scipy.stats.spearmanr) would be a better measurement, as has been used in the evaluations of the SpaGE and stPlus papers?
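As an illustration of why rank-based correlation is more robust here (again with synthetic data, not the benchmark datasets): the same few extreme cells that inflate Pearson's r barely move Spearman's coefficient, because Spearman only sees their ranks.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# Two unrelated expression vectors plus 8 extreme cells in both.
rng = np.random.default_rng(1)
n = 200
measured = np.concatenate([rng.exponential(1.0, n), rng.uniform(50, 100, 8)])
predicted = np.concatenate([rng.exponential(1.0, n), rng.uniform(50, 100, 8)])

pcc, _ = pearsonr(measured, predicted)   # dominated by the 8 extreme cells
scc, _ = spearmanr(measured, predicted)  # ranks: outliers count as 8 of 208
print(f"Pearson:  {pcc:.2f}")
print(f"Spearman: {scc:.2f}")
```

The 8 extreme cells occupy only the top 8 ranks out of 208, so Spearman stays near zero while Pearson is pulled close to 1.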

wenruyustc commented 2 years ago

Hello~ Previously, the results for these two metrics were not significantly different, but that was only on one dataset. I will check Spearman's rank correlation coefficient on all the other datasets when I go back to school in a few days. Thank you for your suggestion.