Changes from PyGenePlexus

ChristopherMancuso commented 2 months ago

The changes here come from commits 5722d9b, d07ce39 and 62cfc9a in the PyGenePlexus repo. The specific changes that might need a UI tweak here:

In both the probability and similarity tables there are now columns for z-scores and p-values
The edge lists are now weighted again
Using Mondo instead of DisGeNet for disease-gene annotations
In the similarity table there is now a column called "Task" that describes the type of GSC it is from
Changed how the auPRC values are returned, especially when they can't be calculated
Added a column "Gene Name" to the tables that show how the input IDs are converted
If training and showing results for different species, the probability data frame can now have Known-Novel and Class-Label columns, and this information is based on one-to-one ortholog info. It can be used in the network figure.

Another UI ask that could be good in this PR is possibly showing only STRING as an option, with a button labeled "More Options" that the user clicks to show the BioGRID and IMP networks. Similarly, the GSC could default to Combined, and the user would need to click a button to see the individual GSCs. This is #44.

I also added a script to gather and zip the data for the function. I uploaded the new data to the GCP bucket.

netlify[bot] commented 2 months ago

Deploy Preview for gene-plexus ready!

Name	Link
Latest commit	3ea2635be740d3762e6c7c9eb7dac04e5e14bce3
Latest deploy log	https://app.netlify.com/sites/gene-plexus/deploys/66ec4f39b10f00000806064b
Deploy Preview	https://deploy-preview-45--gene-plexus.netlify.app

vincerubinetti commented 2 months ago

Getting an error when trying to run this locally:

"Error running GenePlexus:
'DisGeNet'
Traceback (most recent call last):
  File "/app/functions/ml/ml_deploy/main.py", line 58, in ml
    gp.make_sim_dfs()
  File "/usr/local/lib/python3.11/site-packages/geneplexus/geneplexus.py", line 458, in make_sim_dfs
    self.df_sim, self.weights_dict = _geneplexus._make_sim_dfs(
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/geneplexus/_geneplexus.py", line 253, in _make_sim_dfs
    Task = task_convert[gsc_full[termID_tmp]["Task"]]
           ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'DisGeNet'
"

I've already updated the frontend with all the changes, and there's no direct reference to DisGeNet anywhere in this repo. Fairly sure this is an upstream issue with the package.

ChristopherMancuso commented 2 months ago

Interesting. I checked and there isn't a direct mention of DisGeNet in PyGenePlexus either. That error comes from trying to read a specific line in a GSC file. It reads the Task key in the JSON for a given term and then makes the Task key more human-readable (i.e. Mondo becomes Disease). Have you updated all the backend files? When looking at the latest GSC files, it seems that the Task key is indeed Mondo. If you have updated them all, can you let me know the parameters you were using so I can look at those GSC files?
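
A minimal sketch of the lookup described above, to show why a stale GSC file produces this exact error. Only the Mondo to Disease entry is confirmed in this thread. The dictionary contents, the term IDs, and the helper names are assumptions for illustration, not the actual PyGenePlexus code.

```python
# Assumption: a stand-in for the task_convert mapping. Only the
# "Mondo" -> "Disease" entry is confirmed by the discussion; the real
# mapping in PyGenePlexus likely has more entries.
task_convert = {"Mondo": "Disease"}


def readable_task(gsc_full: dict, term_id: str) -> str:
    """Read a term's raw Task key and translate it to a readable label."""
    return task_convert[gsc_full[term_id]["Task"]]


def describe(gsc_full: dict, term_id: str) -> str:
    """Return the readable label, or the KeyError text if the key is unknown."""
    try:
        return readable_task(gsc_full, term_id)
    except KeyError as err:
        return f"KeyError: {err}"


# Hypothetical GSC contents: one term from updated data, one from old data.
current_gsc = {"term_new": {"Task": "Mondo"}}
stale_gsc = {"term_old": {"Task": "DisGeNet"}}

print(describe(current_gsc, "term_new"))  # Disease
print(describe(stale_gsc, "term_old"))    # KeyError: 'DisGeNet'
```

The string "DisGeNet" never has to appear in either codebase for this to fail. It only has to appear as a Task value in an outdated data file.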

vincerubinetti commented 2 months ago

My mistake, I forgot that the data was needed locally. Where can I find the updated data?

ChristopherMancuso commented 2 months ago

The data is uploaded to Google Cloud as described here. Those are the tar files that you would have to untar, rename to data, and put in with the Google function. I think there was an automated way the repo could do this that @falquaddoomi set up? Do you need the link to the GCP project?
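
The manual step above (untar, rename to data, place next to the function) could be sketched roughly as follows. This assumes the archive contains exactly one top-level directory. The function name and all paths are hypothetical, and this is not the repo's own automation.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def install_data(archive: Path, function_dir: Path) -> Path:
    """Extract `archive` and move its single top-level folder to function_dir/data."""
    target = function_dir / "data"
    if target.exists():
        raise FileExistsError(f"{target} already exists; remove it first")
    with tempfile.TemporaryDirectory() as tmp:
        # Extract into a scratch directory so a failed run leaves no partial data.
        with tarfile.open(archive) as tar:
            tar.extractall(tmp)
        top_level = list(Path(tmp).iterdir())
        # Assumption: the archive holds one directory, whatever its name is.
        if len(top_level) != 1 or not top_level[0].is_dir():
            raise ValueError("expected exactly one top-level directory in the archive")
        # Renaming happens here: the extracted folder becomes "data".
        shutil.move(str(top_level[0]), str(target))
    return target
```

Refusing to overwrite an existing data folder is a deliberate choice in this sketch, so that old and new GSC files never end up mixed.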

vincerubinetti commented 2 months ago

I was able to find the project in my GCP console and download the files. I don't think a script for doing that is really necessary. The readme that I neglected to read explains the instructions clearly enough. It works now that I've downloaded the data.

ChristopherMancuso commented 2 months ago

Awesome! I think the action that does the downloading of the GCP files is part of the larger process that auto deploys the cloud function.

vincerubinetti commented 2 months ago

https://github.com/krishnanlab/geneplexus-app-v2/pull/45/commits/6c7107d96ddb7526071dbc047065d3c0e692db3b:

update typescript types to match new backend: add gene name, disgenet -> mondo, add edge weight, add task/z-score/p-value
in network viz, make line thickness based on edge weight
account for cross fold nullish values
fixup readme

https://github.com/krishnanlab/geneplexus-app-v2/pull/45/commits/66a8e816f9eaa5fbc82109e38fab22c9ad8c1915:

rename "links" to "edges" in frontend to be consistent with backend
tweak theme colors, and use them in more places
add "min edge weight" slider to network
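
For reference, the edge-weight logic from these two commits can be sketched as below. The frontend itself is TypeScript; this is written in Python only to match the traceback earlier in the thread. The `Edge` shape, the assumption that weights fall in the range 0 to 1, and the pixel bounds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    weight: float  # assumption: weights are normalized to [0, 1]


def filter_edges(edges: list[Edge], min_weight: float) -> list[Edge]:
    """Keep only edges at or above the slider's minimum weight."""
    return [edge for edge in edges if edge.weight >= min_weight]


def line_width(weight: float, min_px: float = 1.0, max_px: float = 6.0) -> float:
    """Map an edge weight linearly onto a stroke width, clamping out-of-range values."""
    clamped = min(max(weight, 0.0), 1.0)
    return min_px + clamped * (max_px - min_px)


edges = [Edge("A", "B", 0.2), Edge("B", "C", 0.5), Edge("A", "C", 0.9)]
visible = filter_edges(edges, min_weight=0.5)
widths = [line_width(edge.weight) for edge in visible]
```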

Do you want me to add task/z-score/p-value as table columns in this PR?

Other aforementioned changes -- like hiding parameter choices behind a "show more", comparing multiple species, allowing user to upload negative genes -- will require more work and thought and should go in a new PR.

ChristopherMancuso commented 2 months ago

It would be great if you could add the new columns for the tables (i.e. task, z-score, p-values) in this PR. I agree that the three others should be tackled in a new PR.

vincerubinetti commented 2 months ago

I believe I've added everything to the frontend that belongs in this PR, and I've reviewed your code (to the best of my ability), so feel free to merge if ready.

ChristopherMancuso commented 2 months ago

I was going to check out the updates, but the Netlify deploy failed. Is it possible to get that working before I merge? Is there a preference between using the Netlify preview and me using run_local.sh to do final checks of the web server?

vincerubinetti commented 2 months ago

You'll have to run it locally. Netlify only runs the frontend, not the full stack, so it's using the old (current) live cloud functions, which are incompatible due to the changes.

ChristopherMancuso commented 2 months ago

@vincerubinetti is it possible for you to order the columns in both the probability and similarity tables the way they are listed in the README.md file in the functions folder?

vincerubinetti commented 2 months ago

I've updated the column order to match the readme. I've also moved rank to be the first thing because that is what the tables are sorted by, and it would not be good to leave it as the last column as it was.
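
The ordering rule described above (rank first, then the readme order) can be sketched as follows. The column names in the example are placeholders rather than the actual readme list, and the helper is hypothetical. It is written in Python for consistency with the backend, not taken from the frontend code.

```python
def order_columns(columns: list[str], readme_order: list[str], first: str = "Rank") -> list[str]:
    """Put `first` at the front, then follow the readme order, then any leftovers."""
    head = [first] if first in columns else []
    listed = [c for c in readme_order if c in columns and c != first]
    # Columns missing from the readme keep their original relative order at the end.
    extras = [c for c in columns if c not in readme_order and c != first]
    return head + listed + extras


# Placeholder names; the real list lives in the functions folder README.md.
readme_order = ["Entrez", "Gene Name", "Probability", "Rank"]
table_columns = ["Probability", "Rank", "Gene Name", "Entrez"]
ordered = order_columns(table_columns, readme_order)
```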
