Build Catalog Relationships for Jinwoo's Aging Dams Use Case

Castronova commented 2 years ago

Jinwoo's Aging Dams study is the focus of our 90-day prototype. The data associated with this study need to be organized in HydroShare so that the relationships between the datasets are established once they are ingested into the catalog, e.g. by using a HydroShare collection.

[x] Organize the aging dam datasets such that each type of file exists in its own HydroShare resource
[x] Create relationships between them in HydroShare (or using a Collection)
[x] Export metadata from HydroShare to the IGUIDE Catalog.
[ ] Build a GraphQL query that demonstrates how these data can be discovered by searching for a term, e.g. "Aging Dams".

Castronova commented 2 years ago

@sblack-usu As per our discussion today: create HydroShare (HS) resources for Jinwoo's aging dams study, place them in a collection, and explore relationships such as "executedBy". If you have time, put another GeoTIFF in a cloud bucket and establish a relationship between it and the data in HydroShare.

sblack-usu commented 2 years ago

Collection Resource, Raster Datasets, Feature Datasets, Jupyter Notebook

sblack-usu commented 2 years ago

List resources in the collection

{
  geojson_checksum_relations(
    query: {id: "a8174c56c6c449c4a0aa2a5d55e629d5", relations: {type: "hasPart"}}
  ) {
    title
    relations {
      type
      value
    }
  }
}
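
For reference, the query above can also be submitted from a script. The following is a minimal Python sketch, assuming the catalog exposes a standard GraphQL-over-HTTP endpoint. The `GRAPHQL_URL` value and the `build_request` helper are hypothetical and are not part of the catalog; authentication is not shown.

```python
import json
import urllib.request

# Hypothetical endpoint; replace with the catalog's actual GraphQL URL.
GRAPHQL_URL = "https://example.org/graphql"

# The "list resources in the collection" query from this comment, unchanged.
LIST_PARTS_QUERY = """
{
  geojson_checksum_relations(
    query: {id: "a8174c56c6c449c4a0aa2a5d55e629d5", relations: {type: "hasPart"}}
  ) {
    title
    relations {
      type
      value
    }
  }
}
"""


def build_request(query: str, url: str = GRAPHQL_URL) -> urllib.request.Request:
    """Wrap a GraphQL query in a JSON POST request (nothing is sent here)."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(LIST_PARTS_QUERY)
# To actually send it (authentication headers may also be required):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Building the request separately from sending it makes it easy to inspect the payload before pointing the script at a real endpoint.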

Castronova commented 2 years ago

@sblack-usu I've also been working on this using a similar query, and I believe this is almost exactly what we want!

What I'm attempting to do is query a dataset (e.g. I-GUIDE Aging Dam Risk Assessment) and retrieve both metadata about that dataset (title, abstract, URL, authors, etc.) and properties of the related datasets (e.g. title, abstract, URL). Right now I can get the resource IDs of the related datasets, which can be used in a subsequent query to gather this information, but is it possible to do this in a single query?

Would it be possible to change the schema for relations so that the referenced document (i.e. Dataset) is stored instead of the GUID?

{
  "data": {
    "geojson_checksum_relation": {
      "abstract": "I-GUIDE Aging Dam Risk Assessment",
      "relations": [
        {
          "type": "hasPart",
          "value": "1f6b2d5a417a4122a8df5b06b8747792"
        },
        {
          "type": "hasPart",
          "value": "bd768f1f71014368a7943555e8b196e5"
        },
        {
          "type": "hasPart",
          "value": "b268e163dc3649cfad5e9c686d26e84a"
        }
      ],
      "subjects": [
        "I-GUIDE",
        "Aging Dam"
      ],
      "title": "I-GUIDE_Aging_Dam_Risk_Assessment"
    }
  }
}
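
The two-step approach described above starts by pulling the related resource IDs out of this response, so that each one can be used in a follow-up query. A minimal Python sketch, assuming the response has exactly the shape shown here; `related_ids` is a hypothetical helper name:

```python
from typing import Any


def related_ids(response: dict[str, Any], relation_type: str = "hasPart") -> list[str]:
    """Return the `value` of every relation of the given type in a query response."""
    dataset = response["data"]["geojson_checksum_relation"]
    return [
        relation["value"]
        for relation in dataset.get("relations", [])
        if relation["type"] == relation_type
    ]


# Abbreviated copy of the response shown above.
response = {
    "data": {
        "geojson_checksum_relation": {
            "title": "I-GUIDE_Aging_Dam_Risk_Assessment",
            "relations": [
                {"type": "hasPart", "value": "1f6b2d5a417a4122a8df5b06b8747792"},
                {"type": "hasPart", "value": "bd768f1f71014368a7943555e8b196e5"},
                {"type": "hasPart", "value": "b268e163dc3649cfad5e9c686d26e84a"},
            ],
        }
    }
}

# Each ID would then be passed to a second query to fetch title, abstract, etc.
ids = related_ids(response)
```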

What I'm trying to get is something like this:

{
  "data": {
    "geojson_checksum_relation": {
      "abstract": "I-GUIDE Aging Dam Risk Assessment",
      "relations": [
        {
          "datasets": [
            {
              "value": "1f6b2d5a417a4122a8df5b06b8747792",
              "title": "some title",
              "description": "a description of the dataset",
              "url": "http://some-url.com"
            },
            {
              "value": "aasassajasbdfkasdasdf5b06b8747792",
              "title": "some title 2",
              "description": "a description of the dataset 2",
              "url": "http://some-url-2.com"
            }
          ]
        }
      ],
      "subjects": [
        "I-GUIDE",
        "Aging Dam"
      ],
      "title": "I-GUIDE_Aging_Dam_Risk_Assessment"
    }
  }
}

FYI: This may or may not be helpful: https://spin.atomicobject.com/2018/03/09/graphql-api-resolvers/

sblack-usu commented 2 years ago

Your link to resolvers looks like it is on the right track. A resolver is what @aaraney used for the fuzzy search across title/abstract. I can take a stab at this; it'd be good for me to get more comfortable with custom resolvers if we're embracing GraphQL.

Castronova commented 2 years ago

@sblack-usu Feel free to take a crack at it. We can also talk about this during our meeting tomorrow.

sblack-usu commented 2 years ago

Whew, feels good to know more about custom resolvers. Also, if we organize our schema better we can do this with relationships.

query {
  geojson_checksum_relation {
    abstract
    created
    id
    language
    modified
    published
    subjects
    title
    relations {
      relationsResolver {
        title
        abstract
        id
      }
    }
  }
}
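
Conceptually, a resolver like `relationsResolver` swaps each relation's stored GUID for the document it points to. The sketch below illustrates that idea in plain Python with an in-memory lookup. It is not the resolver used in the catalog; `DATASETS`, `resolve_relations`, and the sample titles and abstracts are made up for illustration.

```python
from typing import Any, Optional

# Hypothetical in-memory stand-in for the catalog's dataset collection, keyed by ID.
DATASETS: dict[str, dict[str, Any]] = {
    "1f6b2d5a417a4122a8df5b06b8747792": {
        "id": "1f6b2d5a417a4122a8df5b06b8747792",
        "title": "Example dataset A",
        "abstract": "Placeholder abstract A",
    },
    "bd768f1f71014368a7943555e8b196e5": {
        "id": "bd768f1f71014368a7943555e8b196e5",
        "title": "Example dataset B",
        "abstract": "Placeholder abstract B",
    },
}


def resolve_relations(
    document: dict[str, Any], datasets: dict[str, dict[str, Any]]
) -> dict[str, Any]:
    """Return a copy of `document` whose relations embed the referenced dataset."""
    resolved = []
    for relation in document.get("relations", []):
        # Look up the referenced document; None if the GUID is unknown.
        target: Optional[dict[str, Any]] = datasets.get(relation["value"])
        resolved.append({**relation, "relationsResolver": target})
    return {**document, "relations": resolved}


collection = {
    "title": "I-GUIDE_Aging_Dam_Risk_Assessment",
    "relations": [
        {"type": "hasPart", "value": "1f6b2d5a417a4122a8df5b06b8747792"},
        {"type": "hasPart", "value": "bd768f1f71014368a7943555e8b196e5"},
        {"type": "hasPart", "value": "not-in-the-lookup"},
    ],
}

expanded = resolve_relations(collection, DATASETS)
titles = [
    relation["relationsResolver"]["title"]
    for relation in expanded["relations"]
    if relation["relationsResolver"] is not None
]
```

Doing the lookup on the server side, as a resolver does, is what lets the client get the related titles and abstracts back in a single query.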

Castronova commented 2 years ago

I was hacking something together earlier... I'm excited to read your solution because I wasn't very confident in mine 😁.

Thanks for working on this @sblack-usu !

Castronova commented 2 years ago

Marking this as complete since I was able to use your query to discover the aging dams datasets.

I-GUIDE / data-catalog-archive

Build Catalog Relationships for Jinwoo's Aging Dams Use Case #13