Closed planetf1 closed 2 years ago
I agree, and there is work in progress and on the backlog to make this a reality, since we need to bring in new collaborators.
The definition of "interesting metadata" is different depending on the end user.
Egeria already supports open metadata archives for both types and instances. This is the mechanism slated to be used to load content packs and sample metadata. For example, the data governance project has defined content packs for data privacy that will be implemented as open metadata archives.
The archives to load can be specified at start up. I would like to add an API to load an archive into a running server, since that would be useful for people experimenting.
Without a UI, any experimenter is probably a developer. David and I have been working on samples that tie up with the scenarios from the data governance best practices. For the APIs we have done so far, the samples show the API being used to create metadata. The one I am working on at the moment is for connected asset OMAS. This is a read-only API, so it will use an archive to pre-populate information about assets and then call the API to view the details.
The assets in the archive will have classifications on their schema so you can use them to write a sample for governance engine omas.
Thanks. Those examples would be useful across all our APIs.
For deployment purposes the reach of the sample data will need to go beyond the egeria APIs and also encompass related components like ldap, gaian, postgres and ranger, using capabilities like k8s config maps, init pods and stateful sets to correctly initialize and sequence them. I will focus on that in the deployment project for our next round of demos using what we have so far, and look for opportunities to extend the sample data as more becomes available.
@mandy-chessell @planetf1 could you point me to the samples that show how to create the metadata, or a link explaining the open metadata archives?
Our new tutorials being built in odpi/egeria#1297 should really help with this. Over to Mandy for more comments.
This would be a bit of a hack, I think, but we could seed an environment simply by publishing the payloads here into a cohort:
As I understand it, any samples would have these same payload structures, so I could also spend some time cleaning these up a bit to create a minimal standard set of sample metadata that fits into the Coco Pharma landscape (?)
Interesting idea, though perhaps it would be nicer to use the archive format to load in metadata? And to use other OSS components as actual data servers etc.
My understanding was that the archive format used the same payload structures as everything else -- so creating such an archive would just be a matter of pulling out the payloads we want to retain from the set above, per my suggestion (?)
The open metadata archive format is similar (i.e. the typedefs and instance structures are the same); however, the header of an archive has its own format.
Some thought also needs to go into the metadata collection id used in the archive: is it the identifier of an active repository, an archive, or a deregistered repository? This will determine whether the elements are updateable.
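To make the two points above concrete, here is a minimal sketch of wrapping a filtered set of instance payloads in an archive-style envelope. It is not the actual Egeria archive schema: the function name `build_archive` and the keys `header`, `archiveName`, `instances`, `entities`, `typeName` and `metadataCollectionId` are all assumptions chosen for illustration. It only shows the shape of the idea, namely that the payloads can be reused as they are, while the header and the metadata collection id have to be decided separately.

```python
import json
import uuid
from datetime import datetime, timezone

# NOTE: every key name used here is a hypothetical placeholder,
# not the confirmed Egeria open metadata archive format.


def build_archive(payloads, wanted_types, archive_name, collection_id=None):
    """Wrap selected instance payloads in an archive-style envelope."""
    # The caller decides which metadata collection id the instances carry;
    # if none is supplied, a fresh one is generated for this archive.
    collection_id = collection_id or str(uuid.uuid4())

    # The header is the archive-specific part that plain payloads lack.
    header = {
        "archiveName": archive_name,
        "metadataCollectionId": collection_id,
        "creationDate": datetime.now(timezone.utc).isoformat(),
    }

    entities = []
    for payload in payloads:
        # Keep only the payloads we want in the sample set.
        if payload.get("typeName") not in wanted_types:
            continue
        instance = dict(payload)  # shallow copy so the input is untouched
        instance["metadataCollectionId"] = collection_id
        entities.append(instance)

    return {"header": header, "instances": {"entities": entities}}


if __name__ == "__main__":
    sample = [
        {"typeName": "Asset", "guid": "a1"},
        {"typeName": "GlossaryTerm", "guid": "t1"},
    ]
    print(json.dumps(build_archive(sample, {"Asset"}, "demo"), indent=2))
```

Passing `collection_id` explicitly is the design choice that matters here: it forces whoever builds the archive to decide which repository the elements appear to belong to, which is the question raised above.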
The real work is creating metadata where someone could learn something from their experimentation - they need a purpose and guidance as to which of the 400+ REST operations they should use.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 20 days if no further activity occurs. Thank you for your contributions.
This has been open for quite a while now, and we continue to have periodic questions along these lines on Slack. Maybe I should take this on to create at least some starting point from which we can continue to expand?
It seems important to adoption 😬
Do we still need to keep this open? It is unclear what specific action we might take on this.
Little recent activity or specific action to take, so closing.
Tutorials continue to evolve, both in notebook and via other dojos.
Currently, if someone wants to take a look at Egeria, they can have a look at the code, and of course run the code. They can also read detailed information on the design of the metadata model. And finally, we do now have some sample data.
However with an empty metadata repo they would have to figure out how to populate different kinds of metadata to learn more about how the APIs work. This makes it much harder to get started. This question has come up twice in the last week or so!
So if we could create a quick-start script that loads 'interesting' metadata (based on Coco Pharma) into egeria, it would help users explore the API and understand egeria. More complex scenario demos would potentially use this, plus more data, but to be clear the intent here is something simple for egeria only.
Given we only have an in-memory repository native to egeria, for now this would require a postman script, shell scripts, java etc to load the data -- or perhaps better we could use the 'archive' mechanism which is already used for types.
I think this will help adoption & development.