GenomicsStandardsConsortium / mixs

“Minimum Information about any (X) Sequence” (MIxS) specification
https://w3id.org/mixs
Creative Commons Zero v1.0 Universal

fix broken links for autogenerated documentation pages #530

Open turbomam opened 1 year ago

turbomam commented 1 year ago

Background

Prefix/URL definitions in the schema

MIxS asserts its own prefix as 'MIXS' in terms.yaml and provides the expansion https://w3id.org/mixs/terms/

prefixes:
  linkml: https://w3id.org/linkml/
  mixs.vocab: https://w3id.org/mixs/vocab/
  MIXS: https://w3id.org/mixs/terms/
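
To make the effect of these prefix declarations concrete, here is a minimal Python sketch of how a prefixed identifier (CURIE) would expand to a full URI and contract back again. The function names and the local identifier `0000001` are only illustrative assumptions; this is not LinkML's own implementation.

```python
# Prefix map copied from the prefixes block above.
PREFIXES = {
    "linkml": "https://w3id.org/linkml/",
    "mixs.vocab": "https://w3id.org/mixs/vocab/",
    "MIXS": "https://w3id.org/mixs/terms/",
}


def expand_curie(curie, prefixes=PREFIXES):
    """Expand 'PREFIX:local' into a full URI using the prefix map."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + local


def contract_uri(uri, prefixes=PREFIXES):
    """Contract a full URI to a CURIE, preferring the longest matching expansion."""
    matches = [(p, base) for p, base in prefixes.items() if uri.startswith(base)]
    if not matches:
        raise ValueError(f"no prefix covers {uri!r}")
    prefix, base = max(matches, key=lambda item: len(item[1]))
    return f"{prefix}:{uri[len(base):]}"


# Illustrative identifier, used here only as an example.
print(expand_curie("MIXS:0000001"))
print(contract_uri("https://w3id.org/mixs/terms/0000001"))
```

The point of the sketch is that whatever expansion is declared for MIXS becomes the left-hand part of every term URI, so it has to agree with what the w3id redirection rules can actually resolve.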

w3id mixs namespace

The 'mixs' namespace has been reserved in the w3id system

Options +FollowSymLinks
RewriteEngine on

# vocab/ end point
RewriteRule ^vocab$ https://genomicsstandardsconsortium.github.io/mixs/$1
RewriteRule ^vocab\/(.*)$ https://genomicsstandardsconsortium.github.io/mixs/$1 [R=301,L]

---snip---

# Schema elements use Github Pages
# Rewrite Base URL
RewriteRule ^(.*)$ https://genomicsstandardsconsortium.github.io/mixs/$1 [R=302,L]
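
As a rough aid for reasoning about these rules, the following Python sketch models the two flagged rules shown above as an ordered list where the first match wins. It is a simplification, not a reimplementation of Apache mod_rewrite: it assumes the patterns are matched against the part of the path after `mixs/`, it ignores the flagless `^vocab$` rule, and it knows nothing about the rules hidden behind the snip.

```python
import re

GH_PAGES = "https://genomicsstandardsconsortium.github.io/mixs/"

# (pattern, substitution, HTTP status) for the two flagged rules quoted above.
RULES = [
    (re.compile(r"^vocab/(.*)$"), GH_PAGES + r"\1", 301),
    (re.compile(r"^(.*)$"), GH_PAGES + r"\1", 302),
]


def redirect(relative_path):
    """Return (status, target URL) for a path relative to w3id.org/mixs/."""
    for pattern, substitution, status in RULES:
        match = pattern.match(relative_path)
        if match:
            # match.expand fills \1 with the captured group, like $1 in .htaccess.
            return status, match.expand(substitution)
    return None


for path in ["", "Soil/", "vocab/Soil"]:
    print(repr(path), "->", redirect(path))
```

Under these assumptions both the `vocab/` rule and the catch-all land on the same GitHub Pages base, differing only in the redirect status.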

Documentation build process overview:

  1. Makefile rule generated/docs/index.md uses LinkML gen-doc to convert the schema into Markdown pages. Those are saved locally in the generated/docs path. They are not pushed to GitHub, so you won't see them in the web interface.
  2. generated/docs/introduction/%.md is invoked when generated/docs/introduction/background.md is requested. It copies some static Markdown files from static_md/ into generated/docs/.
  3. mkdocs_html/index.html uses mkdocs to convert the Markdown files to HTML. Those aren't pushed either.
  4. gh_docs deploys the HTML files to GitHub pages.
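
The four steps above can be sketched as a small Python driver that calls the named make targets in order. This is only an illustration: the helper names are made up, and the real Makefile may already chain these targets through its dependencies, in which case invoking the last one would be enough. By default it only prints the commands.

```python
import subprocess

# Make targets named in the overview above, in the order they are described.
BUILD_TARGETS = [
    "generated/docs/index.md",                    # schema -> Markdown via gen-doc
    "generated/docs/introduction/background.md",  # copy static Markdown
    "mkdocs_html/index.html",                     # Markdown -> HTML via mkdocs
    "gh_docs",                                    # deploy HTML to GitHub Pages
]


def build_commands(targets=BUILD_TARGETS):
    """Return one 'make <target>' argument list per build step."""
    return [["make", target] for target in targets]


def run_build(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in build_commands():
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


run_build(dry_run=True)
```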

Syntax of schema documentation web site URLs

The path to the MIxS GitHub repo is https://github.com/GenomicsStandardsConsortium/mixs. The path to the GitHub pages site is https://genomicsstandardsconsortium.github.io/mixs/, as far as GitHub is concerned. GitHub isn't aware that we are using w3id redirection to map http://w3id.org/mixs to https://genomicsstandardsconsortium.github.io/mixs/.

When we mention our schema documentation website to the public, we should always use the shorter http://w3id.org/mixs URL.

mkdocs.yml asserts that the base URL of each autogenerated documentation page should be the GitHub pages path mentioned above, https://genomicsstandardsconsortium.github.io/mixs/

Numerical term identifiers are used as the right-hand portion of each term page's URL. Numerical identifiers haven't been assigned to the EnvironmentalPackages, Checklists and CombinationClasses in the current release, so those pages end in the textual name of those elements.

Numerical identifiers have been assigned to EnvironmentalPackages, Checklists and CombinationClasses in the schemasheets branch, but that branch still requires work before it can become the next release.

Manually maintained URL assertions in the GitHub repo

The About section in the upper right of the MIxS repo's home page recommends this documentation link: https://genomicsstandardsconsortium.github.io/mixs/. That should probably be replaced with a w3id-based link.


See also MIxS issues labeled w3id and documentation:

turbomam commented 1 year ago

actions:

turbomam commented 1 year ago

The homepage for the MIxS autogenerated documentation can be reached through https://w3id.org/mixs or https://genomicsstandardsconsortium.github.io/mixs/

One can return to the home page by clicking the "person reading a book" button in the upper left, or the Reference link on most or all other pages

examples of reported broken links:

turbomam commented 1 year ago

related problems:

turbomam commented 1 year ago

Might be able to solve the over-sized search index file by reading https://squidfunk.github.io/mkdocs-material/setup/setting-up-site-search/

turbomam commented 1 year ago

I have eliminated the use of the mixs.vocab prefix

If it's really important to restore it, I would ask someone else to

turbomam commented 1 year ago

Note that I made edits directly to the YAML files in model/schema, so they would be overwritten if anybody ran gsctools/mixs_converter.py again. I presume it would be possible to edit that script to generate YAML files like those included in this PR, but I think finishing the schemasheets implementation is a better investment. That will allow for declaring the prefixes and their expansions outside code.

turbomam commented 1 year ago

Note: some of the broken example URLs above included the .md extension. Did that ever work? It doesn't now, and that makes sense because .md files aren't/never were being served.

https://genomicsstandardsconsortium.github.io/mixs/Soil/ is the correct path to learn about the MIxS Soil package/extension. When a user requests that page, the server knows to return the index.html page within that path, i.e. https://genomicsstandardsconsortium.github.io/mixs/Soil/index.html
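
A simplified Python sketch of that lookup, assuming a static host that only maps directory-style URLs onto index.html (real GitHub Pages behavior has more cases). The function name and the example set of deployed files are hypothetical.

```python
from urllib.parse import urlsplit


def served_file(url):
    """Return the file path a static host would look up for a URL (simplified)."""
    path = urlsplit(url).path
    if path.endswith("/"):
        # Directory-style URL: the host serves the index.html inside it.
        return path + "index.html"
    return path


# Hypothetical listing of what was deployed: HTML only, no Markdown sources.
DEPLOYED = {"/mixs/index.html", "/mixs/Soil/index.html"}

for url in [
    "https://genomicsstandardsconsortium.github.io/mixs/Soil/",
    "https://genomicsstandardsconsortium.github.io/mixs/Soil.md",
]:
    target = served_file(url)
    print(url, "->", target, "(found)" if target in DEPLOYED else "(not deployed)")
```

In this model a link ending in .md has nothing to resolve to, because only the HTML output of mkdocs is deployed.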

turbomam commented 1 year ago

@JKoehorst's

accurately points out that https://genomicsstandardsconsortium.github.io/mixs/terms/ChecklistClass does not resolve.

That's because there is no ChecklistClass in the release model

ChecklistClass is defined in the schemasheets branch but that's going to require some work before it can replace the current release. I can provide assistance with the technical implementation, but I don't think I should be making the judgement calls regarding things like

ramonawalls commented 1 year ago

@turbomam Thank you so much for providing such extensive background and explanations of this issue. I am very grateful.

I would like to suggest (and work on, with your help) a few changes regarding the namespace.

The decision to include "gensc" in the namespace was made over several discussions and documented in issue #233. The reason to include it is that GSC has other projects besides MIxS (e.g., genomic observatories). Even though those projects aren't to the point of us publishing documentation for them yet, we want to allow for the possibility. The plan was for the MIxS namespace to be https://w3id.org/gensc/mixs/. I think that should be easy enough to do through w3id.

  • 'terms' shouldn't be a part of the URL

Per issue #233, we had decided that all GSC terms, whether part of MIxS or not, would be under the same namespace. I am open to reconsidering this choice if it causes problems with using LinkML. We have no imminent plans to release any terms outside MIxS, so it should be fine to just put all mixs terms under the https://w3id.org/gensc/mixs/ namespace.

The homepage for the MIxS autogenerated documentation can be reached through https://w3id.org/mixs or https://genomicsstandardsconsortium.github.io/mixs/

This brings up another issue, which is the need to change our GitHub organization name. GSC is spelled wrong (it should be "genomic", not "genomics"). I have been unable to change it in the past, but I think I can get Volker to make the change. While we are changing it, I would like to suggest that we change the org name to "gensc" to be consistent with our website and easier to type. I will create a separate issue.

turbomam commented 1 year ago

OK, I apologize for jumping to conclusions.

I can help with most of these tasks if technical requirements are written out.

@cmungall do you prefer for these additional changes to be made before or after your upcoming presentations?

turbomam commented 1 year ago

@ramonawalls' requests above are all doable, but they have to be implemented in at least three places, all in coordination (and preferably with an understanding of the timing):

Org name

w3id prefix and redirection rules

Options +FollowSymLinks
RewriteEngine on

<!-- gensc/mixs ednpoint route -->
RewriteRule ^mixs$ https://genomicsstandardsconsortium.github.io/mixs/$1 [R=302,L]

<!-- TODO: gensc/terms endpoint -->

<!-- Rewrite Base URL -->
RewriteRule ^(.*)$ https://gensc.org/$1 [R=302,L]

I'm not great with htaccess, but I recommend these instead:

Options +FollowSymLinks
RewriteEngine on

RewriteRule ^gensc/mixs/(.*)$ https://genomicsstandardsconsortium.github.io/mixs/$1/ [R=301,L]
RewriteRule ^gensc(.*)$ https://gensc.org/$1 [R=301,L]

LinkML prefix expansion

If we decide to go with the rewrite rules above, then the MIXS prefix should be expanded in LinkML as

prefixes:
  MIXS: https://w3id.org/gensc/mixs/terms/
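
Before committing to these, it may help to check how the recommended rewrite rules and the proposed prefix expansion would interact. The Python sketch below is only a model: it assumes the patterns are matched against the path from the w3id.org root, that the first matching rule wins, and it uses an illustrative local identifier. It is not a substitute for testing against a real Apache instance.

```python
import re
from urllib.parse import urlsplit

# The two recommended rules, with $1 written as \1 for Python's re module.
PROPOSED_RULES = [
    (re.compile(r"^gensc/mixs/(.*)$"),
     r"https://genomicsstandardsconsortium.github.io/mixs/\1/", 301),
    (re.compile(r"^gensc(.*)$"), r"https://gensc.org/\1", 301),
]

# Proposed LinkML expansion for the MIXS prefix.
MIXS_PREFIX = "https://w3id.org/gensc/mixs/terms/"


def resolve(w3id_url):
    """Return (status, target URL) for a w3id URL, or None if no rule matches."""
    path = urlsplit(w3id_url).path.lstrip("/")
    for pattern, substitution, status in PROPOSED_RULES:
        match = pattern.match(path)
        if match:
            return status, match.expand(substitution)
    return None


for url in [
    MIXS_PREFIX + "0000001",         # illustrative term identifier
    "https://w3id.org/gensc/mixs/",  # namespace root with trailing slash
    "https://w3id.org/gensc/mixs",   # namespace root without trailing slash
    "https://w3id.org/gensc",
]:
    print(url, "->", resolve(url))
```

In this model, an expanded term URI is redirected to a path under /mixs/terms/ on GitHub Pages, and the bare namespace roots produce doubled slashes, so the trailing `/` in the first substitution and the `gensc(.*)` capture deserve a closer look.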

genomicstandardsconsortium commented 1 year ago

I have updated the owner/manager roles. Lynn


From: Mark A. Miller @.> Sent: Monday, February 13, 2023 10:35 AM To: GenomicsStandardsConsortium/mixs @.> Cc: genomicstandardsconsortium @.>; Mention @.> Subject: Re: [GenomicsStandardsConsortium/mixs] fix broken links for autogenerated documentation pages (Issue #530)

turbomam commented 1 year ago

Thanks @Lynn. Can you say some more about your decision to remove @cmungall as an owner?

lschriml commented 1 year ago

Hello Mark, yes, only GSC board members should be owners. Lynn