hubmapconsortium / ontology-api

The HuBMAP Ontology Service
MIT License

ontology-api: update neo4j and API #178

Closed: AlanSimmons closed this issue 1 year ago

AlanSimmons commented 1 year ago

Following a discussion with @shirey on Jan 5:

  1. Update the current production ontology neo4j instance with the latest CSV files in this folder.
  2. Update the current production ontology-API so that it has the same endpoints as the one in the ubkg repo.
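
One way to check step 2 is to compare the routes declared by the two services. The sketch below assumes each API publishes an OpenAPI-style spec that has already been loaded into a Python dict; the two specs and their paths are made-up placeholders, not the real ontology-api or ubkg endpoints.

```python
# Minimal sketch: diff the (METHOD, path) pairs of two OpenAPI-style specs.
# Both spec dicts below are hypothetical placeholders for illustration only.

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def endpoints(spec):
    """Return the set of (METHOD, path) pairs declared under spec["paths"]."""
    return {
        (method.upper(), path)
        for path, item in spec.get("paths", {}).items()
        for method in item
        if method.lower() in HTTP_METHODS
    }


def endpoint_diff(current, target):
    """Report endpoints missing from `current` and endpoints only `current` has."""
    cur, tgt = endpoints(current), endpoints(target)
    return {"missing": sorted(tgt - cur), "extra": sorted(cur - tgt)}


# Hypothetical specs: the target declares one endpoint the current API lacks.
current_spec = {"paths": {"/example-a": {"get": {}}}}
target_spec = {"paths": {"/example-a": {"get": {}}, "/example-b": {"get": {}}}}

diff = endpoint_diff(current_spec, target_spec)
print(diff)
```

An empty `missing` list would indicate that the production API declares every endpoint the target spec does.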

shirey commented 1 year ago

Once the latest data and service have been released, let's archive this hubmapconsortium/ontology-api repo and use only the dbmi-pitt/ubkg repo moving forward.

yuanzhou commented 1 year ago

Neo4j PROD rebuilt using the Dec 6 version of the ZIP file
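
The import command in the build log that follows maps each node label and relationship type to one CSV file. As a rough sketch, the same argument list can be assembled from two small tables. The labels, file names, and flags are copied from the logged command; the helper name `build_import_args` and the table names are hypothetical. The two relationship files passed without a type presumably carry the type inside the CSV itself.

```python
# Sketch: rebuild the logged neo4j-admin import arguments from lookup tables.
# Labels, file names, and flags come from the build log; the helper and
# table names are illustrative, not part of the actual Dockerfile.

NODE_FILES = {
    "Semantic": "TUIs.csv",
    "Concept": "CUIs.csv",
    "Code": "CODEs.csv",
    "Term": "SUIs.csv",
    "Definition": "DEFs.csv",
}

# (relationship type, file); None means no type is given on the command line.
REL_FILES = [
    ("ISA_STY", "TUIrel.csv"),
    ("STY", "CUI-TUIs.csv"),
    (None, "CUI-CUIs.csv"),
    ("CODE", "CUI-CODEs.csv"),
    (None, "CODE-SUIs.csv"),
    ("PREF_TERM", "CUI-SUIs.csv"),
    ("DEF", "DEFrel.csv"),
]


def build_import_args(import_dir, database="ontology"):
    """Return the argv list for the import step, in the logged order."""
    args = ["./neo4j-admin", "import", "--verbose", f"--database={database}"]
    args += [f"--nodes={label}={import_dir}/{name}" for label, name in NODE_FILES.items()]
    for rel_type, name in REL_FILES:
        prefix = f"{rel_type}=" if rel_type else ""
        args.append(f"--relationships={prefix}{import_dir}/{name}")
    args += ["--skip-bad-relationships", "--skip-duplicate-nodes"]
    return args


import_args = build_import_args("/usr/src/app/neo4j/import")
print(" ".join(import_args))
```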

Step 9/12 : RUN ./neo4j-admin import --verbose --database=ontology --nodes=Semantic="${IMPORT}/TUIs.csv" --nodes=Concept="${IMPORT}/CUIs.csv" --nodes=Code="${IMPORT}/CODEs.csv" --nodes=Term="${IMPORT}/SUIs.csv" --nodes=Definition="${IMPORT}/DEFs.csv" --relationships=ISA_STY="${IMPORT}/TUIrel.csv" --relationships=STY="${IMPORT}/CUI-TUIs.csv" --relationships="${IMPORT}/CUI-CUIs.csv" --relationships=CODE="${IMPORT}/CUI-CODEs.csv" --relationships="${IMPORT}/CODE-SUIs.csv" --relationships=PREF_TERM="${IMPORT}/CUI-SUIs.csv" --relationships=DEF="${IMPORT}/DEFrel.csv" --skip-bad-relationships --skip-duplicate-nodes
 ---> Running in 0263faf143c3
neo4j 4.2.5
VM Name: OpenJDK 64-Bit Server VM
VM Vendor: Red Hat, Inc.
VM Version: 11.0.11+9-LTS
JIT compiler: HotSpot 64-Bit Tiered Compilers
VM Arguments: [-XX:+UseParallelGC, -Dfile.encoding=UTF-8]
Neo4j version: 4.2.5
Importing the contents of these files into /usr/src/app/neo4j/data/databases/ontology:
Nodes:
  [Concept]:
  /usr/src/app/neo4j/import/CUIs.csv

  [Semantic]:
  /usr/src/app/neo4j/import/TUIs.csv

  [Definition]:
  /usr/src/app/neo4j/import/DEFs.csv

  [Term]:
  /usr/src/app/neo4j/import/SUIs.csv

  [Code]:
  /usr/src/app/neo4j/import/CODEs.csv

Relationships:
  /usr/src/app/neo4j/import/CUI-CUIs.csv
  /usr/src/app/neo4j/import/CODE-SUIs.csv

  CODE:
  /usr/src/app/neo4j/import/CUI-CODEs.csv

  DEF:
  /usr/src/app/neo4j/import/DEFrel.csv

  STY:
  /usr/src/app/neo4j/import/CUI-TUIs.csv

  ISA_STY:
  /usr/src/app/neo4j/import/TUIrel.csv

  PREF_TERM:
  /usr/src/app/neo4j/import/CUI-SUIs.csv

Available resources:
  Total machine memory: 30.92GiB
  Free machine memory: 3.109GiB
  Max heap memory : 6.872GiB
  Processors: 8
  Configured max memory: 21.64GiB
  High-IO: true

Nodes, started 2023-01-18 19:01:14.886+0000
[*Nodes:0B/s 1.160GiB-------------------------------------------------------------------------]21.3M ∆2.35M
Done in 17s 425ms
Prepare node index, started 2023-01-18 19:01:32.320+0000
[*DEDUPLICATE:1.240GiB------------------------------------------------------------------------]89.3M ∆21.9M
Done in 9s 799ms
DEDUP, started 2023-01-18 19:01:42.171+0000
[*DEDUP---------------------------------------------------------------------------------------]    0 ∆    0
Done in 294ms
Relationships, started 2023-01-18 19:01:42.475+0000
[*Relationships:0B/s 1.240GiB-----------------------------------------------------------------]55.8M ∆1.27M
Done in 1m 8s 73ms
Node Degrees, started 2023-01-18 19:02:50.721+0000
[*>(2)================================================================|CALCULATE:1.195GiB(5)==]55.8M ∆19.9M
Done in 6s 60ms
Relationship --> Relationship 1-1754/1754, started 2023-01-18 19:02:56.927+0000
[*>--------------------------------|LINK(5)==========================|v:129.3MiB/s------------]55.8M ∆4.12M
Done in 15s 4ms
RelationshipGroup 1-1754/1754, started 2023-01-18 19:03:11.948+0000
[*>:72.74MiB/s------------------------------------------------------------------------|v:72.74]3.05M ∆3.05M
Done in 1s 13ms
Node --> Relationship, started 2023-01-18 19:03:12.974+0000
[>:81.29MiB/s--------|>(2)====================|LINK--------|*v:147.1MiB/s(2)==================]20.5M ∆19.7M
Done in 2s 145ms
Relationship <-- Relationship 1-1754/1754, started 2023-01-18 19:03:15.181+0000
[>-------------------------------|*LINK(5)=========================|v:120.7MiB/s--------------]55.8M ∆5.69M
Done in 15s 345ms
Count groups, started 2023-01-18 19:03:30.611+0000
[*>---------------------------------------------------------------------------|COUNT:1.036GiB-]3.05M ∆3.05M
Done in 434ms
Gather, started 2023-01-18 19:03:31.522+0000
[>-----|*CACHE:1.411GiB-----------------------------------------------------------------------]3.05M ∆1.50M
Done in 3s 512ms
Write, started 2023-01-18 19:03:35.049+0000
[*>:64.01MiB/s------------------------------------------------------------------------------|v]2.93M ∆1.82M
Done in 1s 94ms
Node --> Group, started 2023-01-18 19:03:36.181+0000
[>-------------------------------|*FIRST-----------------------------|v:??(2)=================] 158K ∆ 158K
Done in 676ms
Node counts and label index build, started 2023-01-18 19:03:37.040+0000
[>(2)=========================|*LABEL INDEX------------------------|COUNT:1.155GiB------------]21.3M ∆17.5M
Done in 2s 397ms
Relationship counts and relationship type index build, started 2023-01-18 19:03:39.452+0000
[*>(2)====================================|RELATIONSHIP |COUNT(4)=============================]55.8M ∆ 8.1M
Done in 4s 921ms

IMPORT DONE in 2m 31s 918ms. 
Imported:
  21292210 nodes
  55837226 relationships
  83141898 properties
Peak memory usage: 1.336GiB
There were bad entries which were skipped and logged into /usr/src/app/neo4j/bin/import.report
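
To compare rebuilds, the summary at the end of the log above can be pulled into a small dict. This is a minimal sketch that assumes the summary keeps the layout shown in this log; the function name `parse_import_summary` is hypothetical.

```python
import re

# Sketch: extract the imported counts from the tail of a neo4j-admin import log.
# Assumes the "<number> nodes|relationships|properties" layout seen in this log.


def parse_import_summary(log_text):
    """Return imported counts plus a flag for the skipped-entries notice."""
    counts = {}
    pattern = r"^[ \t]*(\d+) (nodes|relationships|properties)[ \t]*$"
    for number, kind in re.findall(pattern, log_text, flags=re.MULTILINE):
        counts[kind] = int(number)
    counts["has_skipped_entries"] = "bad entries which were skipped" in log_text
    return counts


# Summary lines copied from the log in this comment.
summary_text = """IMPORT DONE in 2m 31s 918ms.
Imported:
  21292210 nodes
  55837226 relationships
  83141898 properties
Peak memory usage: 1.336GiB
There were bad entries which were skipped and logged into /usr/src/app/neo4j/bin/import.report
"""

summary = parse_import_summary(summary_text)
print(summary)
```

Because the import ran with `--skip-bad-relationships` and `--skip-duplicate-nodes`, a true `has_skipped_entries` flag is a cue to inspect `import.report` before trusting the counts.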

yuanzhou commented 1 year ago

Related: https://github.com/hubmapconsortium/ontology-api/issues/179