Carbon-Farm-Network / holochain-facets

Faceted classifications as a plugin DNA for any data source with unique identifiers.
Apache License 2.0
3 stars 0 forks source link

Implement facets zome / DNA as a mix-in module #1

Open pospi opened 1 year ago

pospi commented 1 year ago

We should develop this module in isolation and genericise for applicability to any Holochain apps.
(For reference, see also draft specification here.)

Design constraints to consider:

API sketch below (please edit / suggest), where


GraphQL mutations to implement as zome API endpoints in extension-resolvers.ts:

GraphQL query APIs & sub-record resolvers to implement in extension-resolvers.ts:

GraphQL zome API methods we could optionally implement:

GraphQL type field resolvers will need to be implemented in order to wire up a complete solution that lets us traverse "downward" from all FacetGroups to possible FacetValues; as well as "upward" from Organization and ResourceSpecification records back to associated FacetValues etc. Here is a quick spec to match with the above:

type FacetGroup {
  id: ID!
  revisionId: ID!
  name: String!
  note: String
  facets: [Facet!]!
}

type Facet {
  id: ID!
  revisionId: ID!
  group: FacetGroup!
  name: String!
  note: String
  values: [FacetValue!]!
}

type FacetValue {
  id: ID!
  revisionId: ID!
  facet: Facet!
  value: String!
  note: String
}

type FacetValueAssignment {
  revisionId: ID!
  value: FacetValue!
}

type Organization {
  facets: [FacetValueAssignment!]!
}

type ResourceSpecification {
  facets: [FacetValueAssignment!]!
}

type Query {
  facetGroups: [FacetGroup!]!
}

Note the id and revisionId fields in each record to populate with string-encoded EntryHash and ActionHash from the zome APIs. FacetValueAssignment.id is not defined since this metadata is unimportant to that record type (it's only retrieved via filtering on identifier and facet_value_eh).

I'm also making the call in this API design to do away with pagination and Relay data structures. I don't think we really need it- there are unlikely going to be hundreds of FacetGroups in most applications, let alone this smaller-scale bioregional textile network.

pospi commented 1 year ago

@fosterlynn I think we'll need to give some consideration to facet updates. I figure we will probably also want a deleteFacetValue. But what about updateFacetValue? And how many FacetValues are allowed to be assigned to each record? Just one per Facet? One per agent per Facet? Or as many as they feel like categorizing?

pospi commented 1 year ago

@LeosPrograms I can recommend the Neighbourhoods sensemaker as a good template to follow for setting up this new repository. Latest on this branch (or main if it's been merged) has useful boilerplate for a TypeScript integrated test runner and Rust workspace scaffolding.

There is also code for creating & testing any JavaScript client libraries but I don't think we will need those here, will probably do direct GraphQL bindings in hREA. Or maybe in the Carbon Farm app. Need to have a think about upstream but my sense is that implementations standardising different non-VF extensions is fine; Bonfire are doing it with all the social extensions.

pospi commented 1 year ago

For indexing helpers I can also recommend the helper libs in hREA, which aren't published to crates.io but you can still use them with

hdk_semantic_indexes_zome_lib = {git = "https://github.com/h-REA/hREA", tag = "happ-0.1.3-beta", package = "hdk_semantic_indexes_zome_lib"}

etc.

There is only one example of a string-based index in the codebase: Agent types. See the index definition zome (paying attention to the _internal definition) and corresponding calling code in the Agent creation logic.

fosterlynn commented 1 year ago

TODO: should we delete any stored FacetValues matching the given option as well?

Possibly more kind is to add a deletable method to FacetValue that returns false if any AgentFacetValues or ResourceSpecificationFacetValues exist. And also check that when trying to delete.

I think we'll need to give some consideration to facet updates.

Hmmm. Yes. I think my opinion is that we support updates, and trust the user to not change the meaning. Other opinions welcome. [EDIT: Changed my mind, see below.]

And how many FacetValues are allowed to be assigned to each record? Just one per Facet? One per agent per Facet? Or as many as they feel like categorizing?

Just one FacetValue per Facet per record. [EDIT}

pospi commented 1 year ago

If we need updates then I would also recommend following the pattern used in the hREA zomes, via import of

hdk_records = {git = "https://github.com/h-REA/hREA", tag = "happ-0.1.3-beta", package = "hdk_records"}

fosterlynn commented 1 year ago

If we need updates...

Actually, updates can wait, they aren't a priority for this release. In a pinch, they can remove, delete, re-add everything. And there won't be a lot of data for a while.

[edit: changed my mind, let's do updates]

pospi commented 1 year ago

Updated the proposed API to reflect the addition of FacetGroup per @fosterlynn's outline.

pospi commented 1 year ago

Also have been reflecting that these types of modules are fine - https://github.com/Carbon-Farm-Network/holochain-facets/issues/1#issuecomment-1529313392

...but for something like this it might be much simpler and more efficient to manage your own link structures rather than trying to use generic record management libraries like hdk_records or hdk_crud.

fosterlynn commented 1 year ago

Actually, updates can wait, they aren't a priority for this release. In a pinch, they can remove, delete, re-add everything. And there won't be a lot of data for a while.

As we worked on the UI, I changed my mind back, and we included updates. :sweat: If anyone disagrees, let's revisit. It just seemed too annoying and confusing to make people delete and add.

pospi commented 1 year ago

Some realignment on terminology to update through the API outline in the initial issue, which I'll do momentarily.

image

pospi commented 1 year ago

A side-note to @LeosPrograms that there are some scripts in VF GraphQL which you might be able to steal from in order to reduce boilerplate (and not have to redeclare all your GraphQL types in TypeScript manually).

See graphql-codegen on NPM, this config file and this package script. Easy enough to configure & run before build, and then you can just commit it like a metadata file after altering the schema defs.

pospi commented 1 year ago

Initial spec now updated to match latest conversations. @fosterlynn please let us know if there still feels to be any discrepancy in our understanding of the problem space. (And apologies to @LeosPrograms for any unnecessary refactoring needed.)

One final question RE spec I have now is for reverse queries to load lists of assigned records. Do we need FacetValue.associatedValues() (paginated) and FacetValueAssignment.record (in this case it would be a union type of Agent | ResourceSpecification)?

fosterlynn commented 1 year ago

Do we need FacetValue.associatedValues() (paginated) and FacetValueAssignment.record (in this case it would be a union type of Agent | ResourceSpecification)?

No pagination needed for any of the facet data. There will never be very much data.

I don't know if you need a FacetValueAssignment.record union. I suppose if you are going to use just one class for the assignment, you will need that. [EDIT: But doesn't that sort of negate the point of putting the assignment into the facet zome?]

P.S. Please take a look at the model for all naming, although I just thought of one improvement, which I'll make soon there: categorizes instead of what you are calling record and I was calling categoryFor. If you don't like any of the naming, or it was not immediately clear to you, let's chat. Also re. naming, you are combining ResourceSpecificationFacetValue and AgentFacetValue into FacetValueAssigment, which seems like a good name if that relationship will reside with the facet data zome and want to be generic to any record type.

pospi commented 1 year ago

Possible GraphQL API revision to consider:

You could likely remove FacetValueAssignment and reference FacetValue directly if you included the associationId: ID (nullable) in FacetValue such that it only shows up when referenced in reverse from Organization.facets or ResourceSpecification.facets. Thus that field could serve as FacetValueAssignment.revisionId would, but without the additional record in the way.

pospi commented 1 year ago

I don't know if you need a FacetValueAssignment.record union. I suppose if you are going to use just one class for the assignment, you will need that. [EDIT: But doesn't that sort of negate the point of putting the assignment into the facet zome?]

@fosterlynn I think in this you're essentially saying that facets display is a priority ("show me all the declared Facets for this Organization"), facets querying is not in scope yet ("find me all the Organizations who are Farmers").

fosterlynn commented 1 year ago

@fosterlynn I think in this you're essentially saying that facets display is a priority ("show me all the declared Facets for this Organization"), facets querying is not in scope yet ("find me all the Organizations who are Farmers").

Facets querying is in scope from the map search. But @happysalada has said that he doesn't need to use the actual fields for that, the users can type any text from anywhere in the returned data and it will work. I'll let him explain that. On the other hand, if we are writing a general purpose facets module, we can consider allowing for the "find me all the Organizations who are Farmers" query. Let's discuss.

Then repeating from chat, here are the main ways the UI will want facets loaded are:

  1. One in the create/update modals, where people need to pick FacetValues for an Agent, or FacetValues for a ResourceSpecification. So for one of the FacetGroup objects (we'll need a filter there), get all the Facets, and for each of those Facets, get all the FacetValues.
  2. One for the display of a specific Agent or a ResourceSpecification, which will start with that object, get the FacetValues associated with that one object, and for each of those FacetValues, get the Facet it relates to.

The other one I can think of is for initial creation of the Facets and FacetValues, but 1. can work for that, to show what already exists. And that won't be used very often.

fosterlynn commented 1 year ago

Organization.facets or ResourceSpecification.facets

Do you mean Organization.facetValues and ResourceSpecification.facetValues?

And just to make sure, you are understanding that Organization to FacetValue is a M:M, right, irrespective of revisions?

happysalada commented 1 year ago

here is the search function (it might need a bit of an update, but it was working last year).

  function filter() {
    displayData = allData.filter(datum => matchObject(searchInput, datum))
  }
  function matchObject(searchInput: string, datum: any): boolean {
    let terms = searchInput.split(' ')
    return terms.every((term) => {
      return Object.values(datum).some((value) => matchValue(term, value))
    })
  } 
  function matchValue(term: string, value: any): boolean {
    if (typeof value === 'string') {
      return value.toLowerCase().includes(term)
    }
    if (typeof value === 'object') {
      return Object.values(value).some((inner) => matchValue(term, inner))
    }
    return false
  }

basically go through each object and do string matching. So as long as the agents have the facets included, then search should work. This "toy" search solution works under 1000 objects, after you might run into performance issues. For new york textile, I think it should be pretty good.

pospi commented 1 year ago

Do you mean Organization.facetValues and ResourceSpecification.facetValues?

I mean, in terms of the datatypes actually being linked I do (or FacetValueAssignment depending on this), but I thought a simpler facets was a more succinct type-level edge name than facetValues since they're probably most naturally referred to as "the facets assigned to this Organization"?

And just to make sure, you are understanding that Organization to FacetValue is a M:M, right, irrespective of revisions?

Correct.

fosterlynn commented 1 year ago

but I thought a simpler facets was a more succinct type-level edge name than facetValues since they're probably most naturally referred to as "the facets assigned to this Organization"?

I don't think so, I think we should say what we mean, and facets means something different, especially to people who are already familiar with faceted classification. But also to anyone who looks at the definitions or model, especially important for the devs who will need to understand how the model works to write queries.

fosterlynn commented 1 year ago

@fosterlynn please let us know if there still feels to be any discrepancy in our understanding of the problem space.

Getting to this request!

Stored Facet option data needs to be indexed for efficient retrieval by record_type.

Let's clean this up to use the terms we are using, I don't actually know what this means.

deleteFacetValue(ActionHash) to remove a previously allowed option for a Facet in case of error or change. We should not delete any previously stored FacetAssociations (below) so that UIs can decide whether to display the redacted reference on other records.

We shouldn't allow deletion of a FacetValue if there are stored FacetValueAssociations that reference it. The user can go remove or change the association first if needed. Also, we should get rid of the term "option" and say what it is in the model.

And I would prefer we use FacetValueAssociation, not FacetAssociation, which would mean something else. Also in the param names, get names, wherever it occurs. Actually, I see later on you did use FacetValueAssociation. Let's use that everywhere!

getAssociatedFacets(AssociationQueryParams { identifier, facet_ehs, facet_group_ehs }: AssociationQueryParams) -> Vec where facet_eh and facet_group_eh are both optional if neither is passed, load all associated FacetValues for the record identifier. otherwise, filter the results to only those with identifiers matching the passed vectors of facet_ehs and facet_group_ehs

I don't see that we need facet or facet group as parameters for any known use cases. Am I missing something? Best to keep it simpler, especially for this round of development, imo. Especially we don't need facet group, that will follow naturally, and a record will be only related to facet values part of one group.

Also note that some apps won't need FacetGroup at all, it should be optional in a generic facet model that can be used outside of Carbon Farm Network.

type FacetValue {
  id: ID!
  revisionId: ID!
  facet: Facet!
  value: String!
  note: String
}

On the above, value should be name. Also, it should include order.

type Facet {
  id: ID!
  revisionId: ID!
  group: FacetGroup!
  name: String!
  note: String
  values: [FacetValue!]!
}

The above should include order.

GraphQL zome API methods we could optionally implement: getAllFacetValues()

I don't think we want this.

type FacetValueAssignment {
  revisionId: ID!
  value: FacetValue!
}

This confuses me. Either the FacetValueAssignment needs to include the record reference or it is not a type, it is a property. Or is revisionId a reference to the record? If so, can we make that more self-documenting, or just document it with a comment in the code? To me it looks like the revisionId of the FacetValueAssignment. Also, is there an ID?

I'll add another comment later to continue my review, need a bit more time to think about all the reference names you are using.

fosterlynn commented 1 year ago

"find me all the Organizations who are Farmers"

Possibly OT, but just to make sure we're clear: "Farmer" is a role in the network, not done with facets. (In the non-hacked model, these roles involve 2 agents.)