Projects/collections endpoints and pages

frafra commented 3 years ago

We are generating collection pages at the moment, but we would like to fetch the description of the collection from the dataset page. The datasets contain a list of tags: some of these tags are collection tags (this is a workaround, as GBIF does not support collections). We could determine whether a tag is a collection tag by checking it against a short, fixed list of tags.
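A minimal sketch of that check, assuming the dataset's tags are available as plain strings. The tag names in `COLLECTION_TAGS` are placeholders, since the real list is not given here, and `collectionTagsOf` is a hypothetical helper name.

```javascript
// Hypothetical fixed list of collection tags; the real list is not given in this thread.
const COLLECTION_TAGS = new Set(["collection-a", "collection-b"]);

// Assumes the dataset's tags arrive as an array of plain strings.
// Returns only the tags that appear in the fixed collection-tag list.
function collectionTagsOf(tags) {
  return tags.filter((tag) => COLLECTION_TAGS.has(tag));
}
```

For example, `collectionTagsOf(["birds", "collection-a"])` returns `["collection-a"]`.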

To achieve that, we tried to generate two files from the same Markdown page: a full HTML page and a JSON file containing the rendered content and the front matter. We relied on the https://github.com/jekyll/jekyll-redirect-from plugin, as suggested in https://github.com/jekyll/jekyll/issues/3041#issuecomment-547340684, but the resulting files were empty (see the jekyll-redirect-wip branch).

Another approach would be to write a custom Jekyll generator, but we are not familiar with the Ruby language.

If we are not able to get this using Jekyll, we could rely on a small custom API service containing the HTML content with the front matter inside a JSON object.

MortenHofft commented 3 years ago

I might be misunderstanding, so bear with me if I'm stating the obvious:

There is only one dataset page, and it is just JavaScript, so Jekyll does not know what the rendered content is. The content is rendered by the browser. If you want the dataset pages to be pre-generated, then you probably need to write a custom Jekyll plugin, something similar to https://github.com/contentful/jekyll-contentful-data-import

frafra commented 3 years ago

Thank you, Morten, for the quick reply :)

We are not looking to pre-generate the dataset page, but to fetch extra information from a Jekyll-generated JSON file. It is a bit of a mix, and we are also open to exploring other solutions :)

We will add a tab to the dataset page containing information about the project/collection the dataset belongs to, by making a request via JavaScript to fetch those details. We were thinking of producing that response by generating a static JSON file via Jekyll. So it would be a mix :)
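Roughly, the client side of this could look like the sketch below. The URL layout (`/collections/<tag>.json`) and the function names are assumptions made for illustration; nothing like them exists in the repository yet.

```javascript
// Hypothetical URL layout: one Jekyll-generated JSON file per collection tag.
function collectionJsonUrl(tag) {
  return `/collections/${encodeURIComponent(tag)}.json`;
}

// fetchImpl is injectable so the function can be exercised without a network;
// in the browser it defaults to the global fetch.
async function loadCollection(tag, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(collectionJsonUrl(tag));
  if (!response.ok) {
    throw new Error(`Could not load collection "${tag}": HTTP ${response.status}`);
  }
  // Parsed JSON: the rendered content plus the front matter.
  return response.json();
}
```

The dataset page would call `loadCollection(tag)` for each collection tag and render the result in the new tab.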

MortenHofft commented 3 years ago

I'm available for a call if you want to discuss this on Skype/Zoom. It might be easier.

What I think of as the project description is already available in the dataset API response. E.g. https://api.gbif.org/v1/dataset/c3413793-cd8e-4f74-b0b7-e1f0c155102c

"project": {
  "title": "Improving Biodiversity data accessibility in the Caribbean countries of Trinidad & Tobago, Barbados and Suriname",
  "identifier": "BID-CA2016-0006-REG",
  "contacts": [...],
  "funding": "This project is made possible through a BID project by GBIF, financed by the European Union.",
  "studyAreaDescription": "This project covers Introduced fauna in Suriname observed, collected or mentioned in literature from approximately 1960 till present.",
  "designDescription": "The goal of this project is to make taxonomic records of the NZCS and biogeographic and ecological research executed in Suriname accessible to all.",
  "abstract": "The purpose of this project is to make the various collections at the main institutions housing the biodiversity information available digitally through publication and the possible creation of a CariBIF portal for the region (Atlas of Living Caribbean). The Introduced Fauna Suriname Checklist was especially created to supply data for the Atlas of Living Caribbean."
}
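A small sketch of reading that field on the client, assuming `dataset` is the parsed JSON from the API call above. `projectSummary` is a hypothetical helper name, and only fields visible in the excerpt are used.

```javascript
// Pick a few of the project fields shown in the response above.
// Returns null when the dataset has no "project" object.
function projectSummary(dataset) {
  const project = dataset.project;
  if (!project) return null;
  return {
    title: project.title ?? "",
    identifier: project.identifier ?? "",
    abstract: project.abstract ?? "",
  };
}
```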

MortenHofft commented 3 years ago

Ahhh - I think I understand now what you want. I've tried to rephrase it below. Is this correct?

You have your own definition of what a project is. Let us call it a group. You add tags to the datasets to match them to a group. Then you have a group page (called a collection page) that describes the group with some metadata and lists the associated datasets. And on the individual dataset pages, you want to fetch that information in some format and show the "parent" group metadata.

right?

MortenHofft commented 3 years ago

We might be doing something similar. E.g. https://www.gbif.org/project/83239/improving-biodiversity-data-accessibility-in-trinidad-and-tobago-barbados-and-suriname

We have a CMS where we write the prose and specify the projectId. We use that projectId to display the associated datasets. For the individual datasets, the project description comes from the dataset EML, and we just show a link back to the CMS-controlled project page. So it is possible (and necessary) for the individual datasets to describe how they fit into the project. But we could just as well have pulled it from the CMS, forcing all descriptions to be the same.

MortenHofft commented 3 years ago

I'm certain you could write a generator to do this, or one might already exist, but you could also do something like this:

Define a new Jekyll collection, similar to the posts. Let us call it _projects. Those pages will be generated as HTML, but you now also have access to them in Liquid and can generate one big JSON file containing all the project descriptions, including the rendered HTML and front matter.

It is less ideal as the file contains all the projects as one endpoint, but it might do?
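On the dataset page, the combined file could then be consumed along these lines. This is only a sketch: it assumes the file parses to an array of entries, each with a `tag` field taken from the front matter, and that shape is not defined anywhere in this thread. Both function names are hypothetical.

```javascript
// Build a lookup from collection tag to project entry.
// Assumed entry shape (hypothetical): { tag, title, content, ...front matter }.
function indexProjectsByTag(projects) {
  return new Map(projects.map((project) => [project.tag, project]));
}

// Return the project entries matching any of the dataset's tags, in tag order.
function projectsForDataset(index, datasetTags) {
  return datasetTags
    .filter((tag) => index.has(tag))
    .map((tag) => index.get(tag));
}
```

Because every project is in one file, the page fetches it once and resolves all of a dataset's tags locally instead of making one request per tag.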

I created a branch to illustrate the idea.

siwelisabeth commented 3 years ago

Thank you very much, @MortenHofft. We will use this solution for project descriptions and other information we want to enrich our pages with :-)

gbif / hp-living-norway

Projects/collections endpoints and pages #11