Open etj opened 1 year ago
Some comments from another issue referencing this topic:
https://github.com/GeoNode/geonode/issues/11494#issuecomment-1723335481 by @giohappy
@gannebamm at the moment the idea is just to implement the simple relationship model. We can extend it to include additional properties in the future, but this will require also extending the editor. This is something I wouldn't do until the metadata editor has not been migrated to the new client. I have almost completed the design of a proposal for the new metadata editor, but no timeline at the moment since.
https://github.com/GeoNode/geonode/issues/11494#issuecomment-1725079088 by @gannebamm
Thünen is currently looking at ISO-19115 metadata fields not yet covered in the GeoNode metadata schema. We will need to extend the wizard. Also, #10290 // #10367 will require changes in the wizard. If possible, we could try to work together on the new metadata editor, and you could get some feedback on its developer experience. From our side, for example, @ahmdthr and @ridoo are part of the development team. What do you think?
Thanks for the proposal @gannebamm . I just wanted to add some of my thoughts on this topic to the discussion. First of all I want to give some impression on the current interaction with metadata inside of geonode. As geonode ist a project that is growing for long time it has a long history of internal APIs. In the current implementation the metadata values are accessed through Django methods like Querrysets
, and Forms
and since some time also by the django-rest-api
including their serializers
and Viewsets
. Therefore when adding new fields to the metadata model the following parts need editing:
This different methods also require different validation strategies which require some maintenance when editing the metadata model.
Therefore, as we now have a more or less robust rest API, I would suggest to only use the API for interaction with the metadata model. Additionally the base.forms and templates could be removed, and a lot of code needs to be edited to make metadata interaction generally API based. But afterwards, there would be place for a new metadata editor only using rest v2 to interact with geonode django-rest-api backend. This would also allow wider possibilities for geonode metadata editors and would make space for geonode project allow different templates for the metadata editor like (slim, medium, ISO19115:2003).
Still if somebody requires more or other fields for metadata, she need to change the database model, rest API and UI and something not mentioned yet, the pyCSW interaction. For the pyCSW interaction I go definitely with the issue https://github.com/GeoNode/geonode/issues/9147 which would make it easier to bring changes to metadata model to the CSW API.
@gannebamm @mwallschlaeger The high level design I'm planning for the rewrite of the metadata editor is the following:
As you will notice, this configuration is very similar to what we have for the new filters and facets API.
This configuration could be initially serialized (JSON) and served to the client statically. It could be overridden inside a project, similar to the geocode config overrides. A second step could be to make the configuration build dynamically and define an interface for modules to register additional fieldsets, fields, and field configurations dynamically.
The goal is to:
Ok, this is a very rough description. I wanted to take more time to write it down, but given the WIP GNIP has been created already I think it was useful to share it right away.
@giohappy thanks for sharing this idea with us. I was spinning my head around this for some time the recent days. If i understand you correct this is only related to the UI part, so the JSON would not configure any API endpoints or database columns behind the API right?
I think generally your idea could be a good approach to make it easier to edit metadata editor in geonode. But for some projects i was working on in the past on which I applied the same approach, I ended up with blowing and extending the JSON definition more and more because the definitions are not fine grained enough. And when I think about my latest experiences with the widgets and forms in geonode I guess it's going to be very hard to implement all the aspects of this into a simple JSON model. I still like the idea and approach to make it more easy to edit the metadata editor view.
Just a quick mention of some work done by @t-book in that regard: https://github.com/csgis/custom_metadata
This would also mean implementing the metadata editor with React as well, right? If so, I am rather skeptical in terms of complexity. ( Based on experiences with the development with mapstore2 ). But I would wait for the final GNIP.
@t-book What other alternatives would be maintainable? One of the goals of @giohappy is
make configurations, APIs, and the client for the metadata editor converge on what we already do for filters, avoiding duplicated models and configuration schemas where possible
Which implies using mapstore2 as the framework.
Even though I also see mapstore2 developing as a daunting and complex task. However, I am no front-end developer; therefore, this is heavily biased and likely due to inexperience.
@giohappy as far I can see, the API you have in mind would provide client instructions how to render widgets based on metadata which is might be configurable on the backend side.
Do you have any reasons to include such a coupling between REST API and client? When it comes to metadata, I would have expected the client to make sense of the metadata for its own scope (a client does not always render things).
You already have an OpenAPI description available. Did you accounted this already, and why not to extend metadata this way? Even JSON schema might be an option (but I have to admit not to thought about this option deeper, yet).
If "coupling" gets necessary, maybe configurable layouting on the client side can be an option.
Do you have any reasons to include such a coupling between REST API and client? When it comes to metadata, I would have expected the client to make sense of the metadata for its own scope (a client does not always render things).
It depends on what you mean by coupling. My proposal is to have a way to tell the client what we want to edit, and how to lay it out (given a list of available widgets/components). This configuration must be dynamic because we want projects to extend and/or modify it. Or even, in the future, create custom forms beyond the metadata editor.
Where should these configurations stay? How do you give it to the client? An API is what I would use. I'm not expecting to implement such an API in GeoNode itself but in the geonode-mapstore-client
app. This app has already a lot of "coupling" with the client, being the app that renders the client. At the same time, it decouples the client from GeoNode's core.
You already have an OpenAPI description available. Did you accounted this already, and why not to extend metadata this way? Even JSON schema might be an option (but I have to admit not to thought about this option deeper, yet).
Here we're talking of something different. We want a model to describe forms, I don't see how this can be covered by the current API.
This would also mean implementing the metadata editor with React as well, right? If so, I am rather skeptical in terms of complexity. ( Based on experiences with the development with mapstore2 ). But I would wait for the final GNIP.
React doesn't imply MapStore2. We can see if MapStore2's routing and state management can help or not here, but we're not assuming that we MUST use MapStore2. On the other hand, we promote React2 to avoid using vanilla Javascript for such an important and complex component and to harmonize the client components. Of course, this can be discussed with the people that will work on this. What we really care is avoid Frankesteins :)
We're still fighting with the plethora of approaches in the legacy templates. Parts written with Angular, others with jQuery + dozens of unmaintained libraries.
Where should these configurations stay? How do you give it to the client? An API is what I would use. I'm not expecting to implement such an API in GeoNode itself but in the
geonode-mapstore-client
app. This app has already a lot of "coupling" with the client, being the app that renders the client. At the same time, it decouples the client from GeoNode's core.
Seems, that I thought you wanted to add this to the GeoNode API (/api/v2
). In general, I agree, that geonode-mapstore-client has lot of coupling with the UI .. it actually provides the UI (with some GeoNode integration). Here, such API would be better located than in the GeoNode API of course.
However, until now I have thought of the backend implementation of geonode-mapstore-client is a temporary approach on the path to completely separate frontend from backend. Adding REST API functionality in the contrib module here, feels a bit like cheating :smile:, but I may not see all the parts playing a role here. Do you have more strategic insights how geonode-mapstore-client will evolve (a link would also be fine)?
@t-book
This would also mean implementing the metadata editor with React as well, right? If so, I am rather skeptical in terms of complexity. ( Based on experiences with the development with mapstore2 ).
@giohappy
React doesn't imply MapStore2. We can see if MapStore2's routing and state management can help or not here, but we're not assuming that we MUST use MapStore2. On the other hand, we promote React2 to avoid using vanilla Javascript for such an important and complex component and to harmonize the client components. Of course, this can be discussed with the people that will work on this. What we really care is avoid Frankesteins :)
@giohappy I completely understand the need to converge on one main frontend framework and to get rid of old loose ties (like angular stuff). Would you @t-book think that a 'plain' React-based metadata editor would be easier to extend/develop for, or is this still to much overhead in your opinion?
@giohappy
Where should these configurations stay? How do you give it to the client? An API is what I would use. I'm not expecting to implement such an API in GeoNode itself but in the geonode-mapstore-client app. This app has already a lot of "coupling" with the client, being the app that renders the client. At the same time, it decouples the client from GeoNode's core.
@ridoo
[...] Here, such API would be better located than in the GeoNode API of course.
However, until now I have thought of the backend implementation of geonode-mapstore-client is a temporary approach on the path to completely separate frontend from backend. Adding REST API functionality in the contrib module here, feels a bit like cheating 😄, but I may not see all the parts playing a role here.
What is the worst case in using the geonode-mapstore-client django app to enable that API? I understand your resentments and that thus API would put pressure on the geonode-mapstore-client application and, therefore, make it harder to completely decouple GeoNode core from its UI components. On the other hand you could use the geonode-mapstore-client django app as example to eg build a geonode-vue-client app, right?
What is the worst case in using the geonode-mapstore-client django app to enable that API? I understand your resentments and that thus API would put pressure on the geonode-mapstore-client application and, therefore, make it harder to completely decouple GeoNode core from its UI components. On the other hand you could use the geonode-mapstore-client django app as example to eg build a geonode-vue-client app, right?
I would not speak from "worst case" here. As said, my impression was there will be two separate things in the end (of a path I cannot oversee currently):
At the moment the geonode-mapstore-client is a django module shipping client side code. To be honest, I did not dig deep enough into it to reason about why this is actually necessary. Ok, there is some route mapping, but also (which led me think the current status is an intermediary solution) still a lot of Django templates which then include the JS.
However, I cannot rate about decisions, in particular because I can't oversee the big picture. To me, however, a completely decoupled client makes a good testing candidate for the headless backend: "Is it possible to create a client based on the REST API available". Of course this may make it harder to realize things, especially with regard to the issue we actually try to discuss here. In any sense, it puts focus on alternate, but generic solutions, i.e. in this case separating "what" (backend) from "how" (frontend).
This however, leaves me with the question, why the client not just uses the metadata to render forms and widgets? Actually, the data is described already by its schema (more or less), isn't it? Perhaps, I do miss something really important here :see_no_evil:
EDIT: Keep in mind, I do not bring such a big Django dev history.
@giohappy I forgot to reply on this one
You already have an OpenAPI description available. Did you accounted this already, and why not to extend metadata this way? Even JSON schema might be an option (but I have to admit not to thought about this option deeper, yet).
Here we're talking of something different. We want a model to describe forms, I don't see how this can be covered by the current API.
What I mean is this: there is an OpenAPI description available (which takes quite long for whatever reason) .. however, dependent how this gets generated you may think of a client which takes those metadata descriptions to render widgets, right? For example a dataset is described here:
"Dataset": {
"type": "object",
"description": "DREST-compatible model-based serializer.",
"properties": {
"pk": {
"type": "string",
"readOnly": true
},
"uuid": {
"type": "string",
"readOnly": true
},
"name": {
"type": "string",
...
}
Of course, the schema needs more love and would have to be improved here. But a proper OpenAPI schema would be a benefit for more use cases than a custom API within the geonode-mapstore-client IMHO.
There are also libraries which support you rendering things from json schema (combined with templating).
Where should these configurations stay? How do you give it to the client? An API is what I would use. I'm not expecting to implement such an API in GeoNode itself but in the
geonode-mapstore-client
app. This app has already a lot of "coupling" with the client, being the app that renders the client. At the same time, it decouples the client from GeoNode's core.
It seems to me rendering the component in React from the the configuration it gets from the mapstore-client isn't going to help with the coupling with the core - instead it would either mean maintataining the correspondance between the GeoNode core metadata models and the mapstore-client configurations for React rendering or alternatively providing at least some endpoint in the GeoNode API that would expose the necessary information on the metadata model (which is then used in the mapstore-client to make a configuration palatable for React).
The former apporoach seems like a recipe for errors when metadata models get updated or extended and in the later case why not just expose the metadata schema in a way that would be directly easily palatable and sufficient for rendering?
I naively think it could be done in a way that it's not just hard-coded JSONs but is derived from the models (perhaps also reusing the code for serialization?). And it seems like both a custom API like @giohappy suggested and an approach with a standardized schema description @ridoo mentions would work, just with different trade-offs. But I might be just making many wrong assumtions, please correct me if I am thinking in the wrong direction.
Also, I am not sure I understand the general direction here, please help me out. Is Django app meant to stay within the mapstore client? What functions will it perform that are currently done by Django in the GeoNode core? What is its current scope of responsibilities?
And when I think about my latest experiences with the widgets and forms in geonode I guess it's going to be very hard to implement all the aspects of this into a simple JSON model. I still like the idea and approach to make it more easy to edit the metadata editor view.
@mwallschlaeger what do you think are going to be the the hard aspects? Are there particular parts of the metadata models you're worried about?
@kilichenko-pixida will create a simple POC using react-jsonschema-form to test its capabilities and how it could interact with the OpenAPI schema. This should help to follow up on this long discussion thread and have some implementation tests to criticise (good or bad).
@giohappy As @gannebamm mentioned above, we've worked on a POC for the approach @ridoo suggested.
To recap, the idea is to use an existing React library to render the form to edit the metadata based on the schema, and, as much as possible, to reuse the automatically generated OpenAPI schema we already have. Here is the stand-alone React draft implementation.
The key relevant findings so far:
The problems with the currently generated schema are manifold: 4.1. The generation is slow - it takes around 5s in the dev environment. This is probably solvable with some form of caching but it hasn't been tried yet. 4.2. The drf_spectacular library used for the introspection into the model and generating the schema seems to currently be poorly integrated. It produces multiple runtime errors (see the log in the linked repo) and as a result, many fields are missing from the schema or are represented incorrectly. It can be fixed manually, but the amount of both upfront and maintenance effort required might defeat the purpose of relying on automatic schema generation. 4.3. There is a bidirectional mismatch between what is marked as readOnly in the API (in the serializers and therefore in the schema) and what is actually editable through the advanced metadata editor.
For a more comprehensive overview of the findings, including further considerations about the schema, please see a prior lengthier post on the matter.
Given the points above, at least several key questions come up:
A) What is the general direction of the OpenAPI schema development? Is it meant to be looked into any time soon? What purpose should it eventually serve?
B) Is the poor integration of the drf_spectacular something that's already known about? Are we missing something important about it?
Thanks @kilichenko-pixida for summarizing your findings.
I just wanted to emphasize why we evaluated this path:
Other advantages on that approach would hopefully:
It took some time to read everything and connect the many comments :)
In general, I agree with the approach of basing the form client component / app (whatever it is) on the OpenAPI schema. Theoretically, it's the best and cleanest approach, although with a list of shortcomings that I will list later. Problems are:
So, yes, basing the forms on the OpenAPI would be nice, but in reality, we don't have it now. This doesn't mean we are against its development. I think we SHOULD have a correct and extensive OpenAPI for GeoNode
Basing the form only on the OpenAPI has its limitations by the way.
S, we don't have a properly generated OpenAPI yet, and we're not sure that we can express with it everything that is needed for a complex form.
This is why, in the end, we have considered to have a dedicated API. Yes, there will be some duplication, and we must take care of keeping this API aligned with the resource models, but this gives us complete freedom and flexibility to arrange it for the specific purposes of a form, also in terms of performance and dynamicity.
If any of you have a solution for the OpenAPI approach, and a concrete implementation to support the form requirements, I'm fully open to it!
FYI we're moving this activity forward. Soon we will submit a tech proposal for the refactoring of the metadata services and frontend (editor).
We are keeping into account the proposal to make the metadata editor leverage on the OpenAPI and/or the JSON Schema specifications, but we're finding cases that fall outside the scope of these schemas, that would require custom schemas and protocols in any case. For example, the most important thing that isn't covered well (at least from my findings) is telling how to retrieve dynamic values / enums for selectors. For example, how do you tell the client where and how to retrieve the list of users for selecting owner, poc, etc.?
We don't want to fight against the specification or have to build complex schemas to circumvent any limitations. For this reason, we'll push forward the idea of using an API dedicated to the description of the metadata schema, along with a new API for their management (CRUD ops).
Another pillar for refactoring is building the metadata management API on top of multiple (configurable) model fields (concert or virtual). We want to detach the metadata structure from the Django models and the DB and let any model contribute its fields to the metadata, even extending them for custom purposes.
The goal is to reduce gain flexibility and reduce the number of places where customizations are required to implement custom metadata models (at the moment it means integrating your own models with the Django models through signaling, override the many metadata templates, etc.).
And finally, this goes hand in hand with the upgrade to PyCSW 3, which will probably be part of this activity.
Hi @giohappy Nice to see there is some ongoing work here! We had some discussions about the whole concept of metadata in GeoNode. One of the major issues we discovered is that metadata entries for publications (in terms of research) are static and must be static by design. The dynamic approach of GeoNode which saves user-objects to contacts is therefore against that principle. Here is an example to make this more clear:
Scientist sci-A
works currently for organisation unit org-A
. If sci-A
uploads dataset data-A
he/she will be listed as owner with his/her other contact infos like affiliation to org-A
. Later sci-A
switches to a new organisation unit like org-B
. Due to the dynamic nature of GeoNodes metadata it will result in changed metadata if retrieved in the future for data-A
.
Even though this is the prefered way to e.g. contact sci-A
, this shall not happen from our point of view. It would for example change the list of datasets from org-A
even though data-A
was created while sci-A
was part of it.
I hope this makes our issue clear?
We are aware of the edge-case-iness from our research institutes point of view. We are also aware that our edge-case will not change how the broader community will develop this feature and may think about other ways to solve our issue. One briefly drafted idea was to use the new #12124 to save static metadata files per dataset.
@mwallschlaeger what about ZALF? Have you also discussed this issue?
Hi @gannebamm. I understand your point, The dynamic nature of many metadata fields, and their relations with other model fields, works well in many cases, but far less in others. Yours is an example, but we also have clients that were forced to use the "Metadata upload preserver" option to keep the original uploaded metadata file, and to avoid it being regenerated by GeoNode.
I don't think there's a one-fits-all solution. There could be more advanced options to opt-out from dynamic fields, have metadata versioning, and so on, but all this goes beyond the plans for this refactoring.
We have started investigating the various options, including react-jsonschema-form. We have also looked at the PoC from @kilichenko-pixida.
This is our current plan:
react-jsonschema-form
covers all our needs, including the most specific one: telling the client where and how to retrieve the values for a field (external or internal API endpoint)We will keep you updated about our experiments....
GNIP - New metadata editor
Overview
The metadata editor still relies on a legacy template engine. It should be migrated to use newer libraries
Proposed By