HeardLibrary / vandycite


Prep work for generating Nomenclature and Getty class hierarchies #74

Closed by baskaufs 2 years ago

baskaufs commented 2 years ago

After meeting with Shenmeng on 2022-04-08, @baskaufs is going to look at using NetworkX to create an nx.Graph object from the subject/object node values returned by SPARQL queries. See this page for a tutorial and this page for notes on how to generate the tree diagrams in an organized fashion.
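A minimal sketch of that idea, not the notebook's actual code. It assumes the query returns `subject` and `object` variables in the standard SPARQL JSON results shape, and that each row means "subject is narrower than object". The `example.org` URIs and the `graph_from_bindings` helper are hypothetical. It uses `nx.DiGraph` rather than a plain `nx.Graph` so the broader-to-narrower direction is kept.

```python
import networkx as nx

# Hypothetical sample shaped like SPARQL JSON results; real rows would come
# from the thesaurus queries.
sample_results = {
    "results": {
        "bindings": [
            {"subject": {"type": "uri", "value": "http://example.org/vessel"},
             "object": {"type": "uri", "value": "http://example.org/container"}},
            {"subject": {"type": "uri", "value": "http://example.org/bowl"},
             "object": {"type": "uri", "value": "http://example.org/vessel"}},
            {"subject": {"type": "uri", "value": "http://example.org/box"},
             "object": {"type": "uri", "value": "http://example.org/container"}},
        ]
    }
}


def graph_from_bindings(results, child_var="subject", parent_var="object"):
    """Build a directed graph with one parent -> child edge per result row."""
    graph = nx.DiGraph()
    for row in results["results"]["bindings"]:
        child = row[child_var]["value"]
        parent = row[parent_var]["value"]
        # Assumption: the object is the broader term, so it becomes the parent.
        graph.add_edge(parent, child)
    return graph


hierarchy = graph_from_bindings(sample_results)
print(sorted(hierarchy.edges()))
```

If the query puts the broader term in the subject position instead, swapping `child_var` and `parent_var` should be enough.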

baskaufs commented 2 years ago

@dannihuang830 is going to work on cleaning up the nouns column in the 3d_parts.csv spreadsheet so we can regenerate the thesaurus_ids.csv spreadsheet to use for generating the graphs.

baskaufs commented 2 years ago

Created a test Colab notebook to generate hierarchical diagrams: https://colab.research.google.com/drive/1ayAiYmL77ZsqLU4DNkfDIr1P4y65HA9l?usp=sharing I couldn't get the PyGraphviz module to install locally, but it was already installed in the Colab environment.
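For reference, a rough sketch of how node positions for a hierarchical diagram could be computed (not copied from the Colab notebook). It tries the Graphviz `dot` layout through `nx.nx_agraph.graphviz_layout`, which needs PyGraphviz, and falls back to a simple layer-by-layer layout where PyGraphviz is missing. The `hierarchy_positions` helper and the sample node names are hypothetical.

```python
import networkx as nx


def hierarchy_positions(graph):
    """Return {node: (x, y)} with parents placed above their children."""
    try:
        # Requires PyGraphviz and the Graphviz binaries (available in Colab).
        return nx.nx_agraph.graphviz_layout(graph, prog="dot")
    except ImportError:
        # Fallback: one row per topological generation, root row at the top.
        positions = {}
        for depth, layer in enumerate(nx.topological_generations(graph)):
            for column, node in enumerate(sorted(layer)):
                positions[node] = (column, -depth)
        return positions


# Hypothetical three-level hierarchy with parent -> child edges.
tree = nx.DiGraph([("container", "vessel"), ("container", "box"), ("vessel", "bowl")])
pos = hierarchy_positions(tree)
```

The resulting `pos` dictionary can be passed to `nx.draw(tree, pos, with_labels=True)` in an environment that has matplotlib.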

baskaufs commented 2 years ago

Added cells to query_thesauri_for_descriptive_nouns.ipynb to generate and save the graph, which can be uploaded to the Colab notebook for visualization.

There are too many categories to put on one graph, though.

[image: nomenclature_diagram]
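One possible way to deal with the crowding, sketched under the assumption that the graph is a `DiGraph` with parent-to-child edges: split it into one subgraph per top-level category and draw each separately. The `split_by_top_level` helper and the sample categories are hypothetical, not something already in the notebook.

```python
import networkx as nx


def split_by_top_level(graph):
    """Return {root: subgraph} for every node that has no parent."""
    subtrees = {}
    roots = [node for node, degree in graph.in_degree() if degree == 0]
    for root in roots:
        # Everything reachable from the root, plus the root itself.
        members = nx.descendants(graph, root) | {root}
        subtrees[root] = graph.subgraph(members).copy()
    return subtrees


# Hypothetical hierarchy with two top-level categories.
full = nx.DiGraph([
    ("tools", "hammer"),
    ("tools", "saw"),
    ("furnishings", "chair"),
    ("chair", "armchair"),
])
parts = split_by_top_level(full)
for root, subtree in sorted(parts.items()):
    print(root, sorted(subtree.nodes()))
```

Each subgraph is small enough to lay out on its own; if a single top-level category is still too large, the same function could be applied one level further down.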

baskaufs commented 2 years ago

On 2022-04-12, Shenmeng suggested positioning the labels using the approach in https://stackoverflow.com/questions/49368341/position-showing-of-labels-with-networkx-graphviz

baskaufs commented 2 years ago

Link to Gephi's supported formats: https://gephi.org/users/supported-graph-formats/
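GEXF is Gephi's own format and NetworkX can write it directly, so an export could look roughly like this. This is only a sketch: the sample graph and the `hierarchy.gexf` file name are hypothetical, and a temporary directory is used so the example does not touch the repo.

```python
import os
import tempfile

import networkx as nx

# Hypothetical hierarchy with parent -> child edges.
tree = nx.DiGraph()
tree.add_edge("container", "vessel")
tree.add_edge("vessel", "bowl")

# Write to a scratch directory; a real export would use a path in the project.
out_path = os.path.join(tempfile.mkdtemp(), "hierarchy.gexf")
nx.write_gexf(tree, out_path)

# Read the file back to confirm that the edges survive the round trip.
reloaded = nx.read_gexf(out_path)
print(sorted(reloaded.edges()))
```

`nx.write_graphml` is an alternative if GraphML is preferred.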

baskaufs commented 2 years ago

Just saw a demo of a great Wikidata tool for exploring trees: https://www.entitree.com/ (for example, https://www.entitree.com/en/subclass_of/Q4502142?0d0=d ).

baskaufs commented 2 years ago

Also see https://www.entitree.com/en/subclass_of/Q4502142?0d0=d&0u54=u&0u7=u

baskaufs commented 2 years ago

Completed this with https://github.com/HeardLibrary/vandycite/commit/1b290381b22d5ba4952c2f02baf3f6311a2e9f8d