greenelab / connectivity-search-backend

Django backend for hetnet connectivity search
https://search-api.het.io
BSD 3-Clause "New" or "Revised" License

dj_hetmech_app_node does not include in the code #74

Closed siyuniu closed 3 years ago

siyuniu commented 4 years ago

Hi, I followed the steps in the README and tried to connect the front end with the back end. Although I could open http://localhost:8000/v1/, I got an error when trying to open http://localhost:8000/v1/nodes/. The error message is:

ProgrammingError at /v1/nodes/
relation "dj_hetmech_app_node" does not exist
LINE 1: SELECT COUNT(*) AS "__count" FROM "dj_hetmech_app_node"

I set up Docker and POSTGRES_DB locally. Is there a step I missed?

dhimmel commented 4 years ago

Ah, I think it's probably that your database is empty. We populate the database using the commands at the link below, run inside the conda environment:

https://github.com/greenelab/connectivity-search-backend/blob/65a29f2d282e4f0faa23b978d9cd44cda0231081/dj_hetmech_app/management/commands/populate_database.py#L3-L7

Removing --reduced-metapaths will have the database import the same data we have loaded at https://search-api.het.io/v1/. However, it will take a very long time to populate (days). If you set --max-metapath-length=2, the database will populate faster and you won't have to download such large files.

Locally, I have a database dump titled hetmech-pg_dump.sql.gz that is 5.5GB. Loading this into the database would be faster than repopulating from scratch. Still figuring out how best to share this file.

Thanks for the issue... we should add info to the README on how to populate the database.
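As a rough illustration of the options discussed above, the sketch below assembles the populate_database command line in Python. The helper name build_populate_command is hypothetical and not part of the repository; it uses only the two flags named in this comment and does not run anything.

```python
import shlex


def build_populate_command(max_metapath_length=None, reduced_metapaths=True):
    """Assemble the populate_database invocation as an argument list.

    Hypothetical helper for illustration; it only knows the two flags
    mentioned in this comment.
    """
    args = ["python", "manage.py", "populate_database"]
    if max_metapath_length is not None:
        args.append(f"--max-metapath-length={max_metapath_length}")
    if reduced_metapaths:
        # Keeping this flag imports a reduced set of metapaths.
        args.append("--reduced-metapaths")
    return args


if __name__ == "__main__":
    # Smaller, faster import: short metapaths and the reduced metapath set.
    print(shlex.join(build_populate_command(max_metapath_length=2)))
    # Full import (reported above to take days): drop --reduced-metapaths.
    print(shlex.join(build_populate_command(reduced_metapaths=False)))
```

Building the command as a list rather than a single string keeps it safe to pass to subprocess without shell quoting issues.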

siyuniu commented 4 years ago

Hi, I tried the code you provided. The first three lines ran with no problem. However, when I tried line 6:

python manage.py populate_database --max-metapath-length=3  --reduced-metapaths --batch-size=12000

I got this error:

_download_hetionet_hetmat(self=<dj_hetmech_app.management.commands.populate_database.Command object at 0x1090876a0>) ran in 0:00:02
Traceback (most recent call last):
  File "/anaconda3/envs/hetmech-backend/lib/python3.8/site-packages/django/db/backends/utils.py", line 86, in _execute
    return self.cursor.execute(sql, params)
psycopg2.errors.UndefinedTable: relation "dj_hetmech_app_metanode" does not exist
LINE 1: INSERT INTO "dj_hetmech_app_metanode" ("identifier", "abbrev...

django.db.utils.ProgrammingError: relation "dj_hetmech_app_metanode" does not exist
LINE 1: INSERT INTO "dj_hetmech_app_metanode" ("identifier", "abbrev...
                    ^

It seems that the dj_hetmech_app_metanode table has not been set up yet. Could you tell me how to create it? Thanks a lot.

dhimmel commented 4 years ago

Hmm. Seems like the metanode database table has not been created properly. Can you provide the full output when you run the following two commands:

python manage.py makemigrations 
python manage.py migrate 

Tagging @dongbohu who also might have some insight.
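One way to collect the full output of both commands as text is a small Python wrapper like the sketch below. The function name run_and_capture is hypothetical, and the script assumes it is run from the directory that contains manage.py, inside the conda environment.

```python
import subprocess
import sys


def run_and_capture(args):
    """Run a command and return (exit code, combined stdout and stderr)."""
    result = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so tracebacks are captured too
        text=True,
    )
    return result.returncode, result.stdout


if __name__ == "__main__":
    # Assumption: manage.py is in the current working directory.
    for subcommand in ("makemigrations", "migrate"):
        code, output = run_and_capture([sys.executable, "manage.py", subcommand])
        print(f"$ python manage.py {subcommand} (exit code {code})")
        print(output)
```

The printed text can then be pasted into a comment directly.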

dongbohu commented 4 years ago

What @dhimmel suggests makes sense to me. It might also be helpful to use a SQL statement to list the tables in the database. (If you are using Postgres, psql is the command.)

siyuniu commented 4 years ago

Hi, when I run the migrate command I get this: [image]

When I run python manage.py populate_database --max-metapath-length=3 --reduced-metapaths --batch-size=12000, I get these errors: [image] [image]

siyuniu commented 4 years ago

What @dhimmel suggests makes sense to me. It might be helpful to use SQL statement to list the tables in the database too. (If you are using Postgres, psql is the command.)

Hi, how do I list the tables in the database? Do you mean the Neo4j database or the PostgreSQL database?

dongbohu commented 4 years ago

@siyuniu: Based on the screenshots that you posted above, it seems that the models in the dj_hetmech_app app are not included in the Django project at all. After you run the python manage.py makemigrations command, does the folder dj_hetmech_app/migrations/ get created? If not, something must be wrong with your Django setup.

You can list all tables in Postgres by running \dt after you launch psql, like this:

psql> \dt

I noticed that you were setting up the Django project on macOS. If you can post the exact commands that you used, it will be helpful for me to do the troubleshooting. Thanks.
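The same check can be scripted. The sketch below queries the pg_tables catalog view and reports which of the tables named in the errors above are missing. The function names are hypothetical, and it assumes the Django tables live in the default public schema and that you pass in a DB-API cursor (for example from psycopg2) connected to your Postgres database.

```python
# Tables named in the error messages earlier in this thread.
EXPECTED_TABLES = ("dj_hetmech_app_node", "dj_hetmech_app_metanode")

# Assumption: the tables are in the default "public" schema.
LIST_TABLES_SQL = (
    "SELECT tablename FROM pg_catalog.pg_tables WHERE schemaname = 'public'"
)


def find_missing(present, expected=EXPECTED_TABLES):
    """Return the expected table names that are absent from `present`."""
    present = set(present)
    return [name for name in expected if name not in present]


def missing_tables(cursor, expected=EXPECTED_TABLES):
    """List the public tables through a DB-API cursor and report gaps."""
    cursor.execute(LIST_TABLES_SQL)
    present = [row[0] for row in cursor.fetchall()]
    return find_missing(present, expected)
```

An empty result means both tables exist; if either name is returned, the migrations for dj_hetmech_app have most likely not been applied.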

siyuniu commented 4 years ago

I managed to migrate dj_hetmech_app. However, I ran into another problem while downloading the database: [image] The database was only partially downloaded when I checked PostgreSQL: [image]

siyuniu commented 4 years ago

I also have a question about how paths are implemented. Do the paths exist directly in the database, or are they retrieved with a query every time? Are the paths shown in the frontend generated on the website in real time? If I want to use another Neo4j database with the same functionality, how would you suggest I do it? I really appreciate the help you provide.

dhimmel commented 4 years ago

However, I met with another problem when I am downloading the database

Looks like the error is occurring within this function:

https://github.com/greenelab/connectivity-search-backend/blob/af12f8cf2ad47d9a25ce8d1b7889390654eb3bb9/dj_hetmech_app/management/commands/populate_database.py#L255-L273

This function downloads and loads several zip files. When I run the following ls command, I get:

$ ls -lh dj_hetmech_app/management/commands/downloads/zenodo/1435834
total 173G
-rw-r--r-- 1 dhimmel dhimmel  16M Nov 19  2018 degree-grouped-perms_length-1_damping-0.5.zip
-rw-r--r-- 1 dhimmel dhimmel 130M Nov 19  2018 degree-grouped-perms_length-2_damping-0.5.zip
-rw-r--r-- 1 dhimmel dhimmel 700M Nov 19  2018 degree-grouped-perms_length-3_damping-0.5.zip
-rw-r--r-- 1 dhimmel dhimmel 3.3M Nov 28  2018 dwpcs_length-1_damping-0.0.zip
-rw-r--r-- 1 dhimmel dhimmel  12M Nov 28  2018 dwpcs_length-1_damping-0.5.zip
-rw-r--r-- 1 dhimmel dhimmel 3.0G Nov 29  2018 dwpcs_length-2_damping-0.0.zip
-rw-r--r-- 1 dhimmel dhimmel  11G Nov 29  2018 dwpcs_length-2_damping-0.5.zip
-rw-r--r-- 1 dhimmel dhimmel  36G Nov 30  2018 dwpcs_length-3_damping-0.0.zip
-rw-r--r-- 1 dhimmel dhimmel 123G Nov 30  2018 dwpcs_length-3_damping-0.5.zip

What do you get? (FYI, it's a bit better to insert the raw text in a markdown code block rather than a screenshot, so it's easier for us to copy and search)
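If ls is not convenient, a short Python script can produce the same listing as plain text. This is a minimal sketch: the function name list_downloads is hypothetical, and the directory is the one from the ls command above, relative to the repository root.

```python
from pathlib import Path

# Download directory taken from the ls command above.
DOWNLOAD_DIR = Path("dj_hetmech_app/management/commands/downloads/zenodo/1435834")


def list_downloads(directory):
    """Return (file name, size in bytes) for each .zip file, sorted by name."""
    directory = Path(directory)
    return [(path.name, path.stat().st_size) for path in sorted(directory.glob("*.zip"))]


if __name__ == "__main__":
    if DOWNLOAD_DIR.is_dir():
        for name, size in list_downloads(DOWNLOAD_DIR):
            print(f"{size:>16,d}  {name}")
    else:
        print(f"{DOWNLOAD_DIR} not found; run this from the repository root")
```

Comparing the byte counts against the listing above should show whether any archive is missing or was only partially downloaded.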

I also have a question about path implementation. Do those paths exist directly in the database or are they called with query every time?

The database does not store any paths. It stores types of paths between source and target nodes, where there were more paths than expected. We're writing a manuscript describing this work. It's still a work in progress, but will be helpful in explaining how things work.

If I hope to use another neo4j database and also have these functions, how would you suggest to me to do it?

It would be possible, but would take a lot of work. All of those zip files above contain connectivity measures that were computed for Hetionet, so you'd have to recompute those. You'd also have to tweak parts of this codebase that are Hetionet-specific. To do this, I'd fork this repository and begin changing things for your network. Most of this repository is general, but unfortunately there's still a decent amount of customization that must be provided to apply connectivity search to a single network.

If you want to create something like the https://het.io/search/ webapp for another network, you will need to fork and modify the following repos:

This is certainly possible. And we're interested in any parts of our codebase that could be generalized to more easily accommodate applying it to other networks. But it would likely take considerable full-stack familiarity to do it in a reasonable timeframe.

If you would just like to compute several of the measures, but without a fancy UI, you should directly use https://github.com/hetio/hetmatpy.

siyuniu commented 4 years ago

Thanks a lot for all the information. I really appreciate your help. This is really a cool project. I didn't realize that there are such complicated back-end and database processing behind the simple and user-friendly interface. It would be great if this could be used by other neo4j databases.

dhimmel commented 3 years ago

I managed to migrate the dj_hetmech_app. However, I met with another problem when I am downloading the database

I also encountered this error in https://github.com/greenelab/connectivity-search-backend/pull/79#issuecomment-758355345. It has to do with the neo4j-python-driver encountering an error when it closes its connection. Luckily, this occurs after the database has been fully imported, so it can be ignored.

Closing this issue, but feel free to open new issues or continue commenting.