tweag / nixpkgs-graph-explorer

Explore the nixpkgs dependency graph
MIT License

Add etl process #9

Closed: zz1874 closed this 1 year ago

zz1874 commented 1 year ago

This adds the ETL process: it extracts the package data from Nix and loads it into Postgres via Gremlin Server. To test it, run python etl/etl.py.

Since the dataframe fetched from Nix is quite large (about 60k rows), I load only a subset of it for now. Loading the full dataframe fails with this error: gremlin_python.driver.protocol.GremlinServerError: 500: org.postgresql.util.PSQLException: ERROR: out of shared memory
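For illustration, here is a minimal sketch of the "load only a subset" idea. It is not the code in etl/etl.py. The names take_subset, batched, load_rows, and the load_batch callback are all hypothetical, and plain Python iterables stand in for the dataframe rows. Splitting the subset into small batches is an extra assumption on my part; whether it would avoid the out of shared memory error depends on how the server handles transactions.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def take_subset(rows: Iterable[T], limit: int) -> List[T]:
    """Return at most `limit` rows from the start of `rows`."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    return list(islice(rows, limit))


def batched(rows: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` rows."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def load_rows(
    rows: Iterable[T],
    load_batch: Callable[[List[T]], None],
    limit: int,
    batch_size: int,
) -> int:
    """Send the first `limit` rows to `load_batch` in small batches.

    `load_batch` is a hypothetical callback; in a real pipeline it would
    submit one batch to the graph database. Returns the number of rows sent.
    """
    loaded = 0
    for batch in batched(take_subset(rows, limit), batch_size):
        load_batch(batch)
        loaded += len(batch)
    return loaded
```

With a stand-in callback such as a list's append method, load_rows(range(60000), sent.append, limit=1000, batch_size=300) sends four batches and leaves the remaining rows untouched.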

dorranh commented 1 year ago

@zz1874 are there any pending changes? If not let's go ahead and land this

zz1874 commented 1 year ago

> @zz1874 are there any pending changes? If not let's go ahead and land this

@dorranh Nope, I'll have it merged :rocket: