judyueshen closed this issue 2 months ago
Did you use PCA+Ridge? If not, you can enable it by uncommenting it here: https://github.com/mahmoodlab/HEST/blob/main/bench_config/bench_config.yaml
Ridge alone does pretty badly, probably because of the curse of dimensionality. As a consequence, models with different embedding sizes are not directly comparable under Ridge alone.
If you don't like PCA+Ridge, we also implemented random forest and XGBoost regression.
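For anyone reading along, here is a minimal sketch of the PCA+Ridge idea in plain NumPy. It is not the HEST implementation: the function names, `n_components=64`, `alpha=1.0`, and the synthetic data are all assumptions made for illustration. The point it tries to show is that embeddings of different widths are projected to the same number of components before the ridge fit, so the regressor sees the same input size for every model.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return (mean, components) from an SVD of the centered data."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def fit_ridge(X, y, alpha):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean(axis=0)
    Xc, yc = X - x_mean, y - y_mean
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    coef = np.linalg.solve(gram, Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return coef, intercept

def pca_ridge_predict(X_train, y_train, X_test, n_components=64, alpha=1.0):
    """Fit PCA on the training embeddings only, then ridge on the projections."""
    mean, comps = fit_pca(X_train, n_components)
    Z_train = (X_train - mean) @ comps.T
    Z_test = (X_test - mean) @ comps.T
    coef, intercept = fit_ridge(Z_train, y_train, alpha)
    return Z_test @ coef + intercept

# Synthetic stand-ins for patch embeddings and expression targets
# (hypothetical sizes, not taken from the benchmark).
rng = np.random.default_rng(0)
y_train = rng.normal(size=(300, 3))
for dim in (512, 1536):
    X_train = rng.normal(size=(300, dim))
    X_test = rng.normal(size=(50, dim))
    preds = pca_ridge_predict(X_train, y_train, X_test, n_components=64)
    print(dim, preds.shape)
```

In practice you would probably reach for scikit-learn's `PCA` and `Ridge` instead of hand-rolling these. Either way, the PCA should be fit on the training split only, so that nothing from the test split leaks into the projection.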
Ah, using PCA fixed it. I did not notice it was commented out! Thank you! The curse of dimensionality makes sense. Do you happen to have the number of embedding dimensions for all the models in the benchmark results?
Not as part of the results. Off the top of my head:
Awesome! Thx!!
Hi, I used the tutorial notebook and was not able to reproduce the benchmark results "HEST-Benchmark results (08.30.24)" posted on the main page. For example, the ridge regression results are consistently lower, as attached. As it stands, I can't directly use the table you provided to benchmark my own model. Did you use any specific settings to generate the results, such as a seed? Thank you!