WormBase / pseudoace

Modelling the WormBase ACeDB database in datomic.

Store generated schema as raw datomic schema in EDN #65

Open mgrbyte opened 7 years ago

mgrbyte commented 7 years ago

This will remove a level of abstraction that isn't really necessary. The idea is to consolidate the notion of schema in pseudoace, in both code and data terms.

Once all this is done, there will be no need for the resolve-xref-info function that I previously provided, as that information can then be inferred directly from the schema.

mgrbyte commented 7 years ago

Agreement that this is a good way to go will mean I can close #55 and delete the schema-xref-support branch which has been languishing.

It means not having to code around the problem (which isn't really a problem once we remove the abstraction!)

adamjohnwright commented 7 years ago

I agree with these changes.

mgrbyte commented 7 years ago

Here's an example of what I mean by a "raw" datomic schema.

adamjohnwright commented 7 years ago

I think that is much more readable and extendable than the current schema format.

azurebrd commented 7 years ago

I imagine it would make more sense if I saw the raw version of the WS datomic schema, but it sounds good. To make sure I understand, are we talking about changing the schema257.edn format into a fuller format with more information? I imagine that, depending on how much Caltech (and other) curators want to be involved in the modeling of their datatypes, they'd want access to something relatively understandable. As it is, I don't think they find the current .edn understandable, and we'd keep this format anyway in resourcecs/schema [sic ?]. There hasn't been much talk on how modeling will work, but this sounds like it may be downstream of that.

sibyl229 commented 7 years ago

@mgrbyte +1 for having the :pace meta-schema in the generated schema. It's very helpful

mgrbyte commented 7 years ago

@sibyl229 @a8wright @azurebrd Here's an example of what the schema will end up looking like.

pseudoace-issue-65-example.zip

Notes

Questions

Formatting of the EDN file could be improved.

** potentially change markup in the EDN using some other reader tag than meta if desired.

sibyl229 commented 7 years ago

@mgrbyte this doesn't change how the schema is stored in Datomic once it's loaded, right?

I like the new format! Flat and easy to parse. Also, I think as long as :pace/use-ns and :pace/obj-ref show up with the attribute, it's great 👍
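For illustration only, a flat entry of this kind might look roughly like the sketch below. The attribute names (:gene/id, :gene/reference, :paper/id) and the values given to :pace/obj-ref and :pace/use-ns are assumptions invented for this example, not copied from the generated schema; the :db/* keys are the standard Datomic attribute-definition keys.

```clojure
;; Hypothetical sketch of "raw" Datomic attribute maps in EDN.
;; Names and :pace/* values are illustrative assumptions only.
[{:db/id                 #db/id[:db.part/db]
  :db/ident              :gene/id
  :db/valueType          :db.type/string
  :db/cardinality        :db.cardinality/one
  :db/unique             :db.unique/identity
  :db.install/_attribute :db.part/db}

 {:db/id                 #db/id[:db.part/db]
  :db/ident              :gene/reference
  :db/valueType          :db.type/ref
  :db/cardinality        :db.cardinality/many
  ;; Assumed: :pace/obj-ref names the identifying attribute of the target class.
  :pace/obj-ref          :paper/id
  ;; Assumed: :pace/use-ns lists component namespaces usable on this attribute.
  :pace/use-ns           #{"evidence"}
  :db.install/_attribute :db.part/db}]
```

Each attribute is one self-contained map, so the :pace annotations sit next to the :db/* keys they qualify.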

mgrbyte commented 7 years ago

@sibyl229 Yes, we have to preserve how the schema is stored in Datomic, 1-to-1 with how it is now; the proposal won't change that. Accordingly, :pace/use-ns, :pace/obj-ref (and all other :pace* schema items) will remain as they are today in the db.

This proposal only affects:

azurebrd commented 7 years ago

@mgrbyte @sibyl229 Cool cool.

Yeah, I also like being able to see the :pace/use-ns and :pace/obj-ref. It's great being able to see all the possible attributes and how they relate, very useful.

I liked that the current/previous .edn version had indenting and grouped the attributes for a given datatype together, while the new one is not as visually easy to scan (you have to read until the next newline that begins with :), and some related schemata are in separate sections. Possibly I should be using http://datomic-rest-dev.wormbase.org:8888 instead of the schema###.edn file; I just got used to it, and it's easy to search through and jump around. Certainly possible to use both .edn schemata files =)
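As a rough sketch of how the per-datatype grouping could be recovered from a flat list of attribute maps, the snippet below groups idents by keyword namespace. The `schema` data and the `idents-by-namespace` helper are hypothetical names made up for this example; a real schema file would first need to be read with the Datomic data readers.

```clojure
;; Hypothetical attribute maps; only :db/ident matters for the grouping.
(def schema
  [{:db/ident :gene/id}
   {:db/ident :paper/id}
   {:db/ident :gene/reference}
   {:db/ident :paper/author}])

(defn idents-by-namespace
  "Groups attribute idents by keyword namespace, sorted for stable output."
  [attrs]
  (->> attrs
       (map :db/ident)
       (group-by namespace)
       (map (fn [[ns-name idents]] [ns-name (vec (sort idents))]))
       (into (sorted-map))))

(idents-by-namespace schema)
;; => {"gene" [:gene/id :gene/reference], "paper" [:paper/author :paper/id]}
```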

sibyl229 commented 7 years ago

@mgrbyte that's great! @azurebrd Maybe you could query the schema in Datomic directly? For example: https://github.com/Datomic/day-of-datomic/blob/master/tutorial/schema_queries.clj
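A minimal sketch of such a schema query, in the spirit of the linked tutorial, could look like the following. It assumes the Datomic peer library is on the classpath and uses a throwaway in-memory database (the URI is made up for this example), so it only lists Datomic's built-in attributes; pointing it at the real database would require that database's URI.

```clojure
(require '[datomic.api :as d])

;; Assumption: a throwaway in-memory database keeps the sketch self-contained.
(def uri "datomic:mem://schema-query-sketch")
(d/create-database uri)
(def db (d/db (d/connect uri)))

;; Idents of all installed attributes, sorted.
(def attribute-idents
  (sort (d/q '[:find [?ident ...]
               :where
               [?e :db/ident ?ident]
               [_ :db.install/attribute ?e]]
             db)))

;; Narrow the list to one namespace, here the built-in "db" attributes.
(filter #(= "db" (namespace %)) attribute-idents)
```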

azurebrd commented 7 years ago

Thanks @sibyl229. Direct queries would be nice. The :find queries work well at http://datomic-rest-dev.wormbase.org:8888/browse, but I no longer have access to make the other kind of queries through lein repl. It may be nice to have sometime, but it's not been necessary so far. Do you make all your queries through that tool and the .clj files through datomic-to-catalyst, or do you ever query some other way?