justincy / signs-of-the-second-coming

Exploring the order of the signs of the Second Coming of Christ as mentioned in scripture
https://www.greatdayofthelord.org/
1 star, 0 forks

Convert graph data into JSON #51

Closed: justincy closed this 3 years ago

justincy commented 3 years ago

I began work to import Relationships. I created the db file that will manage the data in memory and write it to disk. However, I realized that my current graph lib doesn't keep all the information I need. During the import process, I lose the scripture references that annotate the edges of the graph. 😬 So I need to decide how to handle that.

At first thought, it doesn't seem worth the work to write a one-off custom import script for this data, because we will throw it away. Having a first-class concept of edges in the graph, on the other hand, could be useful going forward when we generate, process, and interact more with graphs (but that's purely speculation).

justincy commented 3 years ago

It wouldn't be too hard to convert the graph so that edges are first-class citizens and relationship data is removed from the nodes. But I need to think through the consequences of that more. Why didn't I do that in the first place? Are there any usage scenarios that work better with the current setup? It seems like the current setup is optimized for signs and for generating information from them, and nothing else. We can still do that if we move references to the edges (where they rightly belong).
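
A minimal sketch of what first-class edges could look like, assuming a plain Python model rather than the project's actual graph lib. The `Edge` and `Graph` names, the `references` field, and the `sign-a`/`ref-1` values are all hypothetical placeholders. Each edge owns its references, and per-sign views are derived from the edge list instead of being stored on the nodes.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Edge:
    """A relationship between two signs that owns its own references (hypothetical model)."""
    source: str
    target: str
    references: List[str] = field(default_factory=list)


@dataclass
class Graph:
    """Nodes are bare sign ids; all relationship data lives on the edges."""
    nodes: Set[str] = field(default_factory=set)
    edges: List[Edge] = field(default_factory=list)

    def add_edge(self, source, target, references=()):
        # Register both endpoints, then store the edge with its references.
        self.nodes.update((source, target))
        edge = Edge(source, target, list(references))
        self.edges.append(edge)
        return edge

    def outgoing(self, node):
        # Sign-centric view derived from the edge list.
        return [e for e in self.edges if e.source == node]

    def incoming(self, node):
        return [e for e in self.edges if e.target == node]


graph = Graph()
graph.add_edge("sign-a", "sign-b", ["ref-1"])  # placeholder ids and reference
graph.add_edge("sign-a", "sign-c")
```

With this shape, the sign-oriented queries still work (`graph.outgoing("sign-a")`), and nothing about the references is lost when edges are copied or serialized.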

justincy commented 3 years ago

Maybe it's time to find an existing graph library instead of writing my own. Surely one exists that's simple enough for my needs.

justincy commented 3 years ago

Thinking more about this, I realized that my current process looks like this:

graph.gv -> in-memory graph -> in-memory data store -> JSON file

I want to keep the in-memory data store because I'll need that going forward regardless, but I don't need the in-memory graph to be an intermediary step between the graph file and the in-memory data store. I can just modify the import utilities to return the data as-is and use a script that transfers them to the in-memory data store.

graph.gv -> in-memory data store -> JSON file

That will be simpler and will solve the problem of losing scripture refs on the edges.
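
The shortened pipeline could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's import code: it assumes edges in `graph.gv` are written one per line as `a -> b [label="..."];` with references separated by `;` inside the label, and `parse_edges`, `RelationshipStore`, and `import_relationships` are hypothetical names. The real file layout may differ.

```python
import json
import re

# Assumed edge syntax (not confirmed by the source): a -> b [label="ref-1; ref-2"];
EDGE_RE = re.compile(
    r'^\s*"?([\w-]+)"?\s*->\s*"?([\w-]+)"?\s*(?:\[label="([^"]*)"\])?\s*;?\s*$'
)


def parse_edges(dot_text):
    """Return raw edge records straight from the DOT text, with no graph object in between."""
    records = []
    for line in dot_text.splitlines():
        match = EDGE_RE.match(line)
        if not match:
            continue  # skip braces, node declarations, and anything that is not an edge
        source, target, label = match.groups()
        refs = [r.strip() for r in (label or "").split(";") if r.strip()]
        records.append({"from": source, "to": target, "references": refs})
    return records


class RelationshipStore:
    """Hypothetical in-memory data store that can serialize itself to JSON."""

    def __init__(self):
        self.relationships = []

    def add(self, record):
        self.relationships.append(record)

    def to_json(self):
        return json.dumps(self.relationships, indent=2)

    def save(self, path):
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(self.to_json())


def import_relationships(dot_text):
    """graph.gv text -> in-memory data store; call store.save(path) to write the JSON file."""
    store = RelationshipStore()
    for record in parse_edges(dot_text):
        store.add(record)
    return store


sample = 'digraph signs {\n  a -> b [label="ref-1; ref-2"];\n  b -> c;\n}\n'
store = import_relationships(sample)
```

Because the parser hands over plain records, the references on each edge travel with it all the way into the store.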

justincy commented 3 years ago

I actually already have the groups data in JSON.

justincy commented 3 years ago

What format should the synonyms be in?

Synonyms are primarily used for simplifying the graph. The relationship is directional: a -> b means we merge a into b. Do I store that in the sign or in a separate file? If I store it on the sign, do I store it on the duplicate or on the synonym? Storing it on the duplicate seems like it would make processing easier when simplifying the graph, but I could also see the advantage of being able to do a lookup in the other direction. Maybe I should have a separate file that stores this data and build indexes on both sides of the relationship.
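
One way the separate-file option might look, sketched under assumptions: the file is a flat JSON list of `{"from": a, "to": b}` pairs, and both lookup directions are built in memory on load. `SynonymIndex` and its method names are hypothetical, and following chains of merges in `resolve` is an assumption about how simplification would use the data.

```python
import json
from collections import defaultdict


class SynonymIndex:
    """Indexes a -> b ("merge a into b") pairs in both directions (hypothetical sketch)."""

    def __init__(self, pairs=()):
        self.canonical_of = {}                  # duplicate -> sign it merges into
        self.duplicates_of = defaultdict(list)  # sign -> duplicates merged into it
        for duplicate, canonical in pairs:
            self.add(duplicate, canonical)

    def add(self, duplicate, canonical):
        self.canonical_of[duplicate] = canonical
        self.duplicates_of[canonical].append(duplicate)

    def resolve(self, sign):
        """Follow a -> b links until reaching a sign that is not merged further."""
        seen = set()
        while sign in self.canonical_of and sign not in seen:
            seen.add(sign)  # guards against accidental cycles in the data
            sign = self.canonical_of[sign]
        return sign

    def to_json(self):
        pairs = [{"from": d, "to": c} for d, c in self.canonical_of.items()]
        return json.dumps(pairs, indent=2)

    @classmethod
    def from_json(cls, text):
        return cls((item["from"], item["to"]) for item in json.loads(text))


index = SynonymIndex([("a", "b"), ("c", "b"), ("b", "d")])
```

Only the pairs are persisted; the two indexes are derived on load, so the file stays a single source of truth while lookups work from either side.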

justincy commented 3 years ago

I've got all the new data in JSON, though I didn't create db libs for all of them.