dhimmel / learn

Machine learning and feature extraction for the Rephetio project
https://doi.org/10.15363/thinklab.d210

Updates and Results from Changes in all-features/7-transform.ipynb #14

Open maggielee1111 opened 10 months ago

maggielee1111 commented 10 months ago

Hi Daniel! Because of recent updates to the R libraries, I've made several modifications to all-features/7-transform.ipynb, and the results look slightly different. These are the changes I made:

  1. Added degree_transformer = to_fxn(params['degree_transformer']), since I assumed the project is also trying to evaluate transformations for degree_df.
  2. Changed the funs() call to use across(everything(), ...), because funs() is deprecated.
  3. Changed rbind_all() to bind_rows(), since rbind_all() was removed from dplyr (https://github.com/tidyverse/dplyr/issues/4430).

In the transformation-sweep.tsv I got, the first several rows are:
    degree_transformer  dwpc_scaler  dwpc_transformer  alpha  auroc               auroc_lower         auroc_upper         auprc               auprc_lower         auprc_upper
    asinh               mean         log1p             0      0.9941835007236502  0.9938819015259823  0.9944850999213182  0.9805812551344717  0.9797400253392432  0.9814224849297002
    log1p               mean         log1p             0      0.9939196526468115  0.9936136156188465  0.9942256896747764  0.9799278139925953  0.9790933266429557  0.980762301342235
    asinh               mean         asinh             0      0.9939028551379303  0.9936457501750738  0.9941599601007868  0.9798320302118215  0.9791166118524781  0.9805474485711649
    asinh               mean         log1p             1      0.9936890048682038  0.9933479888988564  0.9940300208375511  0.977249912523295   0.9761768963653875  0.9783229286812024
    asinh               mean         asinh             1      0.9935960703477849  0.9932037859475018  0.9939883547480679  0.9773938761438711  0.976172127586398   0.9786156247013441

    Considering these outcomes, do you find the results acceptable? If so, I plan to initiate a pull request. I would appreciate any feedback or insights.
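
For anyone porting similar code, the two dplyr changes described in this comment can be sketched roughly as follows. This is a minimal illustration on toy data, not the notebook's actual code: the data frames, their column names, and the choice of asinh as the applied function are all assumptions made for the example.

```r
library(dplyr)

# Hypothetical toy data; these column names are illustrative only.
feature_df <- tibble(degree_a = c(0, 1, 5), degree_b = c(2, 0, 3))

# A funs()-based call is deprecated in current dplyr.
# across() applies one function to every selected column instead.
transformed_df <- feature_df %>%
  mutate(across(everything(), asinh))

# rbind_all() was removed from dplyr; bind_rows() accepts a list of data frames.
# The two single-row tibbles stand in for per-parameter sweep results.
parts <- list(
  tibble(alpha = 0, auroc = 0.99),
  tibble(alpha = 1, auroc = 0.98)
)
sweep_df <- bind_rows(parts)
```

across() requires dplyr 1.0.0 or later, so an older installation would need an upgrade before this runs.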

dhimmel commented 9 months ago

Hi @maggielee1111, sorry for the slow response. I think the PR is helpful because if anyone else is looking to run the code, it could save them a lot of time!

I am curious how different the results are. Perhaps we'll be able to see in the PR commit diffs.

maggielee1111 commented 7 months ago

Hi Daniel! I just opened a PR. It includes the changes to 7-transform.ipynb that I described in my last comment, and it replaces the transform results with the ones I got. When I looked into the data, I found that the input I use, primary-aurocs.tsv, has different values from the file loaded by summary_df = readr::read_tsv('data/auroc.tsv'). I guess that is why I got a different result.
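
One way to check how far two such inputs diverge is to join them on a shared key and list the rows whose values differ. The sketch below uses small in-memory stand-ins rather than the real files; the feature and auroc column names and the tolerance are assumptions, and the actual tables would be loaded with readr::read_tsv().

```r
library(dplyr)

# Hypothetical stand-ins for primary-aurocs.tsv and data/auroc.tsv.
# Column names and values are invented for illustration.
primary_df <- tibble(feature = c("f1", "f2", "f3"), auroc = c(0.71, 0.64, 0.58))
summary_df <- tibble(feature = c("f1", "f2", "f3"), auroc = c(0.71, 0.66, 0.58))

# Join on the shared key, compute the per-feature difference,
# and keep only rows that differ by more than a small tolerance.
diff_df <- primary_df %>%
  inner_join(summary_df, by = "feature", suffix = c("_primary", "_summary")) %>%
  mutate(delta = auroc_primary - auroc_summary) %>%
  filter(abs(delta) > 1e-8)

print(diff_df)
```

An inner join silently drops features present in only one table, so a full_join() may be a better fit if the two files could list different features.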