autogram-is / spidergram

Structural analysis tools for complex web sites
GNU General Public License v3.0

Refactor Arango class #9

Open eaton opened 1 year ago

eaton commented 1 year ago

A couple of issues with the current Arango class need work:

Worth considering whether to use interface tricks to add helper methods to the database class, move all the current Arango class methods to a separate helper class, etc.

eaton commented 1 year ago

Recent refactoring on the 3.0-dev branch (now merged into main) has addressed part of the first issue: the Arango class is now 'ArangoStore', and it has been reworked to follow the semantics of Crawlee's Dataset and KeyValueStore classes.

The intent of matching Crawlee's semantics with the static open() method and blind push() methods is to make moving between the different data storage tools easier; in theory Spidergram entities can be saved to any datastore that supports serialized JSON data.
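To make the "interchangeable datastore" idea concrete, here is a minimal sketch of those semantics. Everything in it is hypothetical: `JsonStore`, `MemoryStore`, and `saveAll` are illustration names, not Spidergram or Crawlee APIs, and the in-memory array stands in for a real backend.

```typescript
// Hypothetical interface: any store that accepts JSON-serializable records.
interface JsonStore {
  push(items: object | object[]): Promise<void>;
}

// In-memory stand-in for a real backend; this is NOT Spidergram's ArangoStore.
class MemoryStore implements JsonStore {
  private static instances = new Map<string, MemoryStore>();
  readonly records: string[] = [];

  private constructor(readonly name: string) {}

  // Static open(): return the named store, creating it on first use.
  static async open(name = 'default'): Promise<MemoryStore> {
    let store = MemoryStore.instances.get(name);
    if (!store) {
      store = new MemoryStore(name);
      MemoryStore.instances.set(name, store);
    }
    return store;
  }

  // "Blind" push(): serialize and append without checking for existing records.
  async push(items: object | object[]): Promise<void> {
    const list: object[] = Array.isArray(items) ? items : [items];
    for (const item of list) this.records.push(JSON.stringify(item));
  }
}

// Calling code depends only on JsonStore, so the backend can be swapped.
async function saveAll(store: JsonStore, entities: object[]): Promise<void> {
  await store.push(entities);
}
```

Because `saveAll` only knows about `push()`, an Arango-backed store with the same method could be passed in without changing the caller.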

Still undecided: how to initialize the Arango database with the desired list of classes. An 11ty-style config initialization process that sets up the initial Spidergram Context could include an explicit array of entity types, whose metadata/meta functions could supply the initialization information. We'll have to experiment.

eaton commented 1 year ago

Update: the Project class now takes options for model entities in its 'graph' section. They aren't passed on to ArangoStore yet, but they're one step toward more reliable setup and configuration of new graphs. We need to:

  1. Capture graph entity information (collection name, vertex or graph identity, indexes, Arango validation information, JSON transformer/constructor functions, etc.) on the class metadata.
  2. Pass an array of classes into ArangoStore's initialization code rather than hard-coding our core entities. Iterate over those classes' metadata to determine what collections, indexes, etc. should be created to set up a new DB.
  3. Use the incoming _collection property of an Arango JSON payload, and the class metadata information, to determine which constructor to call when rehydrating results.
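
A rough sketch of how the three steps above could fit together, assuming metadata is stored as a static property on each class. All names here (`EntityMeta`, `Resource`, `LinksTo`, `planSetup`, `rehydrate`, and the collection names) are made up for illustration and are not Spidergram's actual API. Creating the collections and indexes in Arango is left out; the sketch only computes what would need to be created.

```typescript
// All names below are hypothetical, for illustration only.
type EntityJson = { _collection?: string; [key: string]: unknown };

// Step 1: the information captured on each class's metadata.
interface EntityMeta {
  collection: string;                      // collection name
  isEdge: boolean;                         // edge vs. document (vertex) collection
  indexes: string[][];                     // each entry lists the fields of one index
  fromJSON: (data: EntityJson) => object;  // constructor function used when rehydrating
}

interface EntityClass {
  readonly meta: EntityMeta;
}

class Resource {
  static readonly meta: EntityMeta = {
    collection: 'resources',
    isEdge: false,
    indexes: [['url']],
    fromJSON: (data) => new Resource(String(data.url)),
  };
  constructor(public url: string) {}
}

class LinksTo {
  static readonly meta: EntityMeta = {
    collection: 'links_to',
    isEdge: true,
    indexes: [],
    fromJSON: (data) => new LinksTo(String(data._from), String(data._to)),
  };
  constructor(public from: string, public to: string) {}
}

// Step 2: iterate over the classes passed in to decide what a new DB needs.
interface SetupPlan {
  documentCollections: string[];
  edgeCollections: string[];
  indexes: { collection: string; fields: string[] }[];
}

function planSetup(classes: EntityClass[]): SetupPlan {
  const plan: SetupPlan = { documentCollections: [], edgeCollections: [], indexes: [] };
  for (const { meta } of classes) {
    (meta.isEdge ? plan.edgeCollections : plan.documentCollections).push(meta.collection);
    for (const fields of meta.indexes) {
      plan.indexes.push({ collection: meta.collection, fields });
    }
  }
  return plan;
}

// Step 3: choose the constructor from the payload's _collection property.
function rehydrate(classes: EntityClass[], data: EntityJson): object {
  const match = classes.find((c) => c.meta.collection === data._collection);
  if (!match) {
    throw new Error(`No entity class registered for collection "${data._collection}"`);
  }
  return match.meta.fromJSON(data);
}
```

With this shape, `planSetup([Resource, LinksTo])` yields the list of collections and indexes to create, and `rehydrate([Resource, LinksTo], payload)` returns a `Resource` or `LinksTo` instance depending on `payload._collection`. A decorator or a separate registry would work as well as a static property; the static property just keeps the sketch short.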