relaton / relaton

The Relaton bibliographic information project
BSD 2-Clause "Simplified" License

Consider generating GitHub Pages to show what data is available in this repo #110

Open ronaldtse opened 1 year ago

ronaldtse commented 1 year ago

@andrew2net : @CAMOBAP can help apply this workflow to all relaton-data-* repositories.

andrew2net commented 1 year ago

@ronaldtse most of the work here is creating indexes in all relaton-data-* repositories. Not every repository has an index yet, so I have started thinking about a relaton-index gem. The reasons are:

The index gem should:

Once we have the same index format in every relaton-data-* repository, we can easily generate index pages using the same Jekyll template. Any thoughts?

ronaldtse commented 1 year ago

@andrew2net I agree that the very first thing is to have the same index structures across all Relaton repositories.

An index is really a subset of the Relaton XML/YAML information of a Relaton collection, in a form that can be easily accessed and searched.

Remember that the Metanorma site generate command creates a mini site based on RXL (Relaton XML) information? That is exactly the purpose of the Relaton index site; look at standards.calconnect.org.

We should extract that code and consolidate such code into relaton-index.

andrew2net commented 1 year ago

@ronaldtse GH Pages doesn't allow Jekyll to run any code beyond Jekyll's core functionality (for security reasons). So we can only put index.yaml into Jekyll's _data folder and use fields from the index file in a template to generate an HTML index page. Jekyll seems to have pagination in its core functionality, but searching across all pages isn't available there.

ronaldtse commented 1 year ago

@andrew2net that's not entirely correct. GitHub Actions can run Jekyll in full, and all you need is to run the https://github.com/actions/deploy-pages action to upload a ZIP file with index.html.

ronaldtse commented 1 year ago

Have a look at https://github.com/actions-mn/build-and-publish/blob/main/action.yml

andrew2net commented 1 year ago

@ronaldtse I apologize for the incorrect statement. I tried to create a custom plugin to generate an index.html directly from the documents in the data folder, but I was unable to make it work. Most likely that happened because I used the github-pages plugin instead of the jekyll gem. It reported that no plugins are allowed except those on its list. If we can create a custom plugin and run it with GHA, then we don't need index.yaml to create index.html. We have documents in the data folder, and every document has an ID and a title, so the custom plugin can just read each document and generate the index.html. I'm sure this approach is better than using index.yaml for the following reasons:

ronaldtse commented 1 year ago

@andrew2net there's no need to apologize... yes I think that is what happened.

If we can create a custom plugin and run it with GHA then we don't need to have index.yaml to create index.html. We have documents in the data folder.

If we do not want to run a Jekyll site in full, we can actually just generate a new *.adoc file per bibliographic object (generated through YAML) in addition to having the YAML in _data/. We can also publish the index.yaml directly because ultimately we need a machine-readable form.
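The per-object *.adoc idea could look roughly like the sketch below. It is only an illustration in Python, not the project's code: the helper names (slugify, render_adoc, write_adoc_files), the file naming, and the page layout with a docid attribute are all assumptions, and entries are taken to be (docid, title) pairs already read from the YAML.

```python
import re
from pathlib import Path


def slugify(docid):
    """Turn a document identifier into a filesystem-safe file stem (assumed naming)."""
    return re.sub(r"[^a-z0-9]+", "-", docid.lower()).strip("-")


def render_adoc(docid, title):
    """Render a minimal AsciiDoc page for one bibliographic item (assumed layout)."""
    return f"= {title}\n:docid: {docid}\n\nDocument identifier: {docid}\n"


def write_adoc_files(entries, out_dir):
    """Write one .adoc file per (docid, title) pair and return the written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for docid, title in entries:
        path = out / f"{slugify(docid)}.adoc"
        path.write_text(render_adoc(docid, title), encoding="utf-8")
        written.append(path)
    return written
```

Calling write_adoc_files with a list of pairs and an output directory would leave one page per item next to the YAML data, which a site generator could then pick up.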

UPDATE: I just noticed that https://pages.github.com/versions/ does not have jekyll-asciidoc. Maybe this will help? https://stackoverflow.com/questions/53215356/jekyll-how-to-use-custom-plugins-with-github-pages

For index.yaml, we will need to provide "identifiable elements" to allow location of a desired resource.

For example, elements like

are used to locate a resource.

andrew2net commented 1 year ago

We can also publish the index.yaml directly because ultimately we need a machine-readable form.

@ronaldtse do we need pages for users or for machines? I assumed we need a list of references for users. The indexes themselves are machine-readable. If we need reference lists for users, I think it would be better to generate a JSON file with references and titles and implement a JavaScript-powered page with pagination and search. Do you agree?

ronaldtse commented 1 year ago

@ronaldtse do we need pages for users or for machines? I assumed we need a list of references for users. The indexes themselves are machine-readable. If we need reference lists for users, I think it would be better to generate a JSON file with references and titles and implement a JavaScript-powered page with pagination and search. Do you agree?

For both.

Agree to use JS powered pages for pagination, but we also want static pages to allow Google to index.
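As a rough illustration of serving both needs from the same data, the sketch below builds a JSON payload for a client-side search script and a set of static, paginated HTML fragments that a crawler can read without JavaScript. It is written in Python purely as a sketch; the function names, the JSON field names ("id", "title"), and the sample entries are hypothetical and not taken from the actual site generator.

```python
import html
import json
from math import ceil


def build_search_json(entries):
    """Serialize (docid, title) pairs for a client-side search/pagination script."""
    return json.dumps(
        [{"id": docid, "title": title} for docid, title in entries],
        ensure_ascii=False,
    )


def build_static_pages(entries, per_page=2):
    """Return one HTML fragment per page so every entry is present in static markup."""
    total = ceil(len(entries) / per_page)
    pages = []
    for n in range(total):
        chunk = entries[n * per_page:(n + 1) * per_page]
        items = "\n".join(
            f"<li><strong>{html.escape(docid)}</strong>: {html.escape(title)}</li>"
            for docid, title in chunk
        )
        pages.append(f"<ul>\n{items}\n</ul>\n<p>Page {n + 1} of {total}</p>")
    return pages


# Hypothetical sample data; real entries would come from a repository's index.
SAMPLE = [
    ("EXAMPLE doc-1", "First example title"),
    ("EXAMPLE doc-2", "Second example title"),
    ("EXAMPLE doc-3", "Third example & last title"),
]
```

The JSON output would feed the interactive page, while the fragments would be wrapped in the site layout and published as ordinary static pages.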

andrew2net commented 1 year ago

@ronaldtse please check https://relaton.github.io/relaton-data-oasis/ If it's OK, I suggest making a GH action repo relation-index-page and using the action in all relaton-data-* repos.

ronaldtse commented 1 year ago

@andrew2net thank you for this!

Can we make the page more compact (less spacing) and provide the full doc id for citing, i.e. "Use DocID OASIS genericode-v1.0 to refer to this item"?

[Screenshot 2023-06-24 at 10 18 53 AM]

CAMOBAP commented 1 year ago

@ronaldtse please check https://relaton.github.io/relaton-data-oasis/ If it's OK, I suggest making a GH action repo relation-index-page and using the action in all relaton-data-* repos.

Technically we don't need a separate repo; we can keep it in support.

andrew2net commented 1 year ago

Technically we don't need a separate repo; we can keep it in support.

@CAMOBAP could you implement the action in support? If yes, let's make an issue in support.

CAMOBAP commented 1 year ago

@andrew2net just to be on the same page: will _config.yml be different for each data repository, or can we keep it in the support repository?

andrew2net commented 1 year ago

@CAMOBAP if it is possible to have a different _config.yml in each data repository, let's do it. I thought about ENV variables, but I think it's better to have different configs. These configs will have a different title and description, all the jekyll-index plugin settings, and maybe pagination. And keep in mind that the site generator is under development now, so it will be updated.

ronaldtse commented 1 year ago

@andrew2net @CAMOBAP when can we move forward with this?

CAMOBAP commented 1 year ago

@ronaldtse the ball is on my side; I will try my best to finish it soon.

CAMOBAP commented 1 year ago

@ronaldtse @andrew2net few questions from my side:

ronaldtse commented 1 year ago

@CAMOBAP for the repo name I prefer the latter.

Every Relaton data set should have an index that is named the same way. I believe @andrew2net needs to implement such shared functionality across the Relaton gems.

andrew2net commented 1 year ago

  • AFAIK not all repos have index.yaml file, will we apply the deploy workflow only for ones that have index.yaml?

@CAMOBAP what repos do you mean?

CAMOBAP commented 1 year ago

  • AFAIK not all repos have index.yaml file, will we apply the deploy workflow only for ones that have index.yaml?

@CAMOBAP what repos do you mean?

Should we apply this shared workflow to other data repositories besides oasis?

P.S. https://github.com/relaton/relaton-data-oasis/pull/6 - ready for review

andrew2net commented 1 year ago

Should we apply this shared workflow to other data repositories besides oasis?

@CAMOBAP Yes, we need to apply this to all relaton/relaton-data-* repos

CAMOBAP commented 1 year ago

Should we apply this shared workflow to other data repositories besides oasis?

@CAMOBAP Yes, we need to apply this to all relaton/relaton-data-* repos

Got it, I propose to merge https://github.com/relaton/relaton-data-oasis/pull/6 first

BTW do all relaton/relaton-data-* repos contain index.yaml or similar?

andrew2net commented 1 year ago

Yes, every relaton/relaton-data-* repository has an index file, but the file names can vary. The jekyll-index plugin has repository-specific settings. The source setting contains the name of the index file.

ronaldtse commented 1 year ago

@andrew2net can we use a single index file name across all flavors? We also need to document the index file specification. There will be people who want to maintain their own Relaton data set. Thanks.

andrew2net commented 1 year ago

@ronaldtse we use versioned index filenames. If we update an index structure, the previous gem version will stop working, so we need to keep the earlier version of the index for a while. Initially, I used filenames like index.yaml, index-bipm.yaml. Then I moved to index-v1.yaml, index-v2.yaml, etc.
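To make the versioned-filename scheme concrete, here is a minimal Python sketch of how a client might choose an index file. It assumes only the index-vN.yaml naming mentioned above; the function names and the idea of a client declaring a single supported version are hypothetical and not how the Relaton gems necessarily do it.

```python
import re

# Matches the versioned names mentioned in the thread, e.g. index-v1.yaml.
INDEX_NAME = re.compile(r"^index-v(\d+)\.yaml$")


def pick_index(filenames, supported_version):
    """Return the index file for the version a client understands, or None."""
    for name in filenames:
        match = INDEX_NAME.match(name)
        if match and int(match.group(1)) == supported_version:
            return name
    return None


def latest_index(filenames):
    """Return the highest-versioned index file, or None if there is none."""
    best_version, best_name = -1, None
    for name in filenames:
        match = INDEX_NAME.match(name)
        if match and int(match.group(1)) > best_version:
            best_version, best_name = int(match.group(1)), name
    return best_name
```

With both index-v1.yaml and index-v2.yaml published, an older client keeps resolving index-v1.yaml while a newer one reads index-v2.yaml, which is why the earlier file has to stay around for a while.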

andrew2net commented 1 year ago

We also need to document the index file specification.

@ronaldtse Some indexes have a unique structure because they use parts of IDs instead of whole IDs. I think we will eventually implement pubid-* gems for all the relaton-* gems, so all the indexes will use ID parts and each will have its own structure. That means that if we need an index file spec, we have to create one for each relaton-data-* repo.