tdwg / website-migration-2022

Website migration 2022

Preserving heading-linked URLs #1

Closed gkampmeier closed 1 year ago

gkampmeier commented 1 year ago

The current site (in Pelican) automatically creates a table of contents, with links to sections and subsections, from the heading levels in Markdown. For example, on https://www.tdwg.org/conferences/2022/instructions-for-abstract-submission/ each H2 and H3 heading generates its own link, which can be referenced to help users pinpoint something in a longer document. That particular page's internal links lose importance over time, but ones such as those on https://www.tdwg.org/conferences/2022/session-list/ are used (however ugly) on external sites (e.g., TDWG's YouTube channel). They need to persist, or judicious forwarding needs to be provided so they don't break (or they need to be fixed wherever we can determine they are used).

ben-norton commented 1 year ago

Does GitHub allow .htaccess files for GitHub-hosted websites? There are ways to ensure URL persistence that would render the issue obsolete.

MattBlissett commented 1 year ago

No, there's very little configuration for a GitHub-hosted site.

Note www.tdwg.org is built using GBIF's Jenkins server, and deployed onto a GBIF server running Apache. It's already using redirects for the standards documents, so GitHub hosting probably isn't an option.

https://github.com/tdwg/infrastructure/blob/master/httpd/www.tdwg.org.conf

https://github.com/tdwg/infrastructure/blob/master/httpd/static.tdwg.org.conf (4.6GB of data on this one)

ben-norton commented 1 year ago

Thanks Matt, that’s what I thought, but it was worth asking.


peterdesmet commented 1 year ago

Just like on the current website, the table of contents plugin that I'm using in Jekyll automatically assigns anchors to headings. These links are less ugly than the current ones, e.g.:

#sym02-how-are-biodiversity-infrastructures-stepping-up-to-address-the-biodiversity-crisis
vs
sym02%20how%20are%20biodiversity%20infrastructures%20stepping%20up%20to%20address%20the%20biodiversity%20crisis?

So anchors on new pages should look better. However, current (ugly) anchors can be preserved by hardcoding a heading's id to the old (ugly) one:

{: id="sym02%20how%20are%20biodiversity%20infrastructures%20stepping%20up%20to%20address%20the%20biodiversity%20crisis?"}
## SYM02 How are biodiversity infrastructures stepping up to address the biodiversity crisis?

It would be a bit of work, but certainly doable.
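
Part of that work could be scripted. The following is a minimal sketch, not the method actually used for the migration. It assumes the old anchors are simply the heading text lowercased with spaces replaced by `%20`, which is inferred only from the examples in this thread; the function names `legacy_anchor` and `pin_legacy_id` are hypothetical.

```python
def legacy_anchor(heading_text: str) -> str:
    """Guess the old-style anchor for a heading.

    Assumption (inferred from the examples in this thread): the old anchor is
    the heading text, lowercased, with each space written as %20.
    """
    return heading_text.strip().lower().replace(" ", "%20")


def pin_legacy_id(markdown_heading: str) -> str:
    """Return the heading preceded by an attribute line that hardcodes its id."""
    level, _, text = markdown_heading.partition(" ")
    if not level or set(level) != {"#"} or not text:
        raise ValueError("expected a heading such as '## Title'")
    return f'{{: id="{legacy_anchor(text)}"}}\n{markdown_heading}'


heading = (
    "## SYM02 How are biodiversity infrastructures stepping up "
    "to address the biodiversity crisis?"
)
print(pin_legacy_id(heading))
```

The output should be checked against the live page before use, since headings with other punctuation may not follow the assumed rule.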

@gkampmeier if you can point me to the pages where you know those deep anchors are used, that would be useful.

gkampmeier commented 1 year ago

https://www.tdwg.org/conferences/2022/instructions-for-abstract-submission/#incomplete%20submission: this link does not look any deeper than the others, but it points to an H3 heading under https://www.tdwg.org/conferences/2022/instructions-for-abstract-submission/#finalize%20and%20submit%20to%20journal, which is an H2 heading (you may need to click on the attached image to see the entire thing).

Screenshot: Screen Shot 2022-11-12 at 1 34 06 PM

This is currently rendered automatically (not coded in Markdown), although it may be part of a header file for the website (?).

peterdesmet commented 1 year ago

@gkampmeier I'm not sure I understand your comment.

To clarify myself:

  1. The Jekyll site will (just like the current Pelican site) automatically assign anchors to all headings. Those anchors are going to be less ugly than the current ones.
  2. We can preserve existing anchors that are currently being used externally (like from YouTube), but it requires some manual work.
  3. To limit the amount of work we have to do for 2, it would be good to have an overview of pages whose anchors are used externally. The session list is one, do you know of any others?

gkampmeier commented 1 year ago

@peterdesmet all the pages that currently have H2/H3 headings of more than one word can produce the "%20" uglies when used as links. How many of these are used in places where people will be tripped up? This is difficult to anticipate.

Is there a site-wide solution where you can at least get users to the base URL (e.g., https://www.tdwg.org/conferences/2022/session-list/) and not have it 404 at https://www.tdwg.org/conferences/2022/session-list/#sym01%20linking%20worldwide%20plant%20data%20-%20world%20flora%20online,%20wfo%20plant%20list,%20ipni,%20and%20beyond? It doesn't take them to the exact place on the page, but it doesn't leave them totally high and dry.

peterdesmet commented 1 year ago

Oh, no worries, even with an incorrect anchor, users will still get to the right page (no extra settings for that).
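
This works because the fragment (the part after `#`) is resolved by the browser and is not sent to the server in the HTTP request, so the server only sees the page path. A small Python illustration, using the session-list URL quoted earlier in this thread (the variable names are just for illustration):

```python
from urllib.parse import urldefrag

url = (
    "https://www.tdwg.org/conferences/2022/session-list/"
    "#sym01%20linking%20worldwide%20plant%20data%20-%20world%20flora%20online,"
    "%20wfo%20plant%20list,%20ipni,%20and%20beyond"
)

# Split the URL into the part the server is asked for and the browser-only fragment.
page, fragment = urldefrag(url)

print(page)      # the page the server is asked for
print(fragment)  # handled client-side; an unknown id leaves the reader at the top of the page
```

For the same reason, a server-side redirect rule cannot distinguish between anchors on a page; preserving them has to happen in the page's own heading ids.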

I just wanted to make sure that for pages where anchors are used a lot, we get them to the right place as well. So far, it seems that is only needed for the 2022 sessions page?

gkampmeier commented 1 year ago

@peterdesmet likely an issue for anything put on YouTube in 2020, 2021, and 2022. This may include working sessions. We're just starting to work on the Working Session videos (not yet transferred to YouTube) for 2022, and I am working with @baskaufs about content for the description. We will likely try to keep it simple for now.

Another place is the about/membership page and perhaps other about-linked pages.

peterdesmet commented 1 year ago

Anchor links are preserved for conference session pages. I noticed some anchor links used in the about pages, but have fixed those. It is possible I missed some, but users will at least be directed to the right page. Closing issue.
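
For anyone wanting to double-check for missed internal links, one possible approach is to scan the site's Markdown sources for link targets whose anchor still contains `%20`. This is only a sketch: the function name `find_legacy_links` and the sample link (including its `#individual%20members` anchor) are made up for illustration, and it only catches inline Markdown links, not external uses such as YouTube descriptions.

```python
import re

# Inline Markdown link targets "](...#anchor)" whose anchor contains %20.
LEGACY_LINK = re.compile(r"\]\(([^)\s]*#[^)\s]*%20[^)\s]*)\)")


def find_legacy_links(markdown_text: str) -> list[str]:
    """Return link targets that still use an old-style %20 anchor."""
    return LEGACY_LINK.findall(markdown_text)


# Hypothetical page content, for illustration only.
sample = (
    "See [membership](/about/membership/#individual%20members) "
    "and the [home page](/)."
)
print(find_legacy_links(sample))
```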