Closed dhimmel closed 5 years ago
The new method, which specifies source_paths when calling load_archive, speeds up _download_path_counts quite a bit:
_download_hetionet_hetmat(self=<dj_hetmech_app.management.commands.populate_database.Command object at 0x7f439ec572b0>) ran in 0:00:00
_hetionet_graph(self=<dj_hetmech_app.management.commands.populate_database.Command object at 0x7f439ec572b0>) ran in 0:01:24
_populate_metanode_table() ran in 0:00:00
_populate_node_table() ran in 0:00:08
_populate_metapath_table() ran in 0:00:00
_download_path_counts(length=1) ran in 0:00:00
_populate_degree_grouped_permutation_table(length=1) ran in 0:00:00
_download_path_counts(length=2) ran in 0:00:00
_populate_degree_grouped_permutation_table(length=2) ran in 0:00:03
_download_path_counts(length=3) ran in 0:00:00
_populate_degree_grouped_permutation_table(length=3) ran in 0:00:31
_populate_path_count_table() ran in 0:16:27
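The speedup from passing source_paths presumably comes from extracting only the needed archive members instead of the whole archive. A minimal sketch of that idea using the standard library zipfile module follows. The helper name extract_selected and its signature are hypothetical and for illustration only; this is not the actual load_archive implementation.

```python
import zipfile
from pathlib import Path


def extract_selected(archive_path, destination_dir, source_paths=None):
    """Extract members of a zip archive into destination_dir.

    Hypothetical helper: if source_paths is None, every member is
    extracted; otherwise only the listed members are written to disk.
    Returns the sorted relative paths of all files now in destination_dir.
    """
    destination_dir = Path(destination_dir)
    destination_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        # members=None makes ZipFile.extractall extract the full archive
        archive.extractall(path=destination_dir, members=source_paths)
    return sorted(
        path.relative_to(destination_dir).as_posix()
        for path in destination_dir.rglob("*")
        if path.is_file()
    )
```

Calling extract_selected(archive, out_dir, source_paths=[...]) with a short list of members skips decompressing and writing everything else, which is where the time would be saved on a large archive.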
Output from the database_info command is the same as above.
With https://github.com/greenelab/hetmech-backend/pull/11/commits/484fd901a12c530ce814f161949a70519494456f, the import into the AWS prototype database ran with the following times:
https://github.com/greenelab/hetmech-backend/pull/11/commits/c17c970761b3d729c356ac6635eaf5f4b1f84bc8 should speed this up even more.
At this point the output of
python manage.py database_info
is: