quarto-dev / quarto-cli

Open-source scientific and technical publishing system built on Pandoc.
https://quarto.org

Fix external broken links in documentation #9005

Closed jasonpott closed 4 months ago

jasonpott commented 4 months ago

What would you like to do?

Report an issue on quarto.org

Description

The hyperlink to the Hypothesis documentation is broken on this page: https://quarto.org/docs/reference/projects/websites.html#hypothesis

cderv commented 4 months ago

Thanks for the report.

It seems Hypothesis changed their website. The page is now at https://h.readthedocs.io/projects/client/en/latest/publishers/config.html

mcanouil commented 4 months ago

See on a related subject (internal links):

mcanouil commented 4 months ago

The current external broken links.

Update on 2024-03-12.

Getting links from: index.html
└─BROKEN─ http://conda.pydata.org/docs/install/quick.html (HTTP_404) -> True

Getting links from: docs/computations/ojs.html
├─BROKEN─ https://typst.app/docs/reference/meta/bibliography/ (HTTP_404) -> True

Getting links from: docs/authoring/cross-references-divs.html
├─BROKEN─ https://tablesgenerator.com/markdown_tables (BLC_UNKNOWN) -> False
├─BROKEN─ https://www.tablesgenerator.com/text_tables (BLC_UNKNOWN) -> False

Getting links from: docs/authoring/diagrams.html
├─BROKEN─ https://isni.org/ (HTTP_403) -> False

Getting links from: docs/authoring/title-blocks.html
├─BROKEN─ https://rajgoel.github.io/reveal.js-demos/fullscreen-demo.html (HTTP_404) -> True

Getting links from: docs/presentations/revealjs/themes.html
└─BROKEN─ https://example.com/mysite/ (HTTP_404) -> True but expected (probably worth doing something anyway)

Getting links from: docs/websites/website-blog.html
├─BROKEN─ https://www.algolia.com/api-keys (HTTP_404) -> True

Getting links from: docs/websites/website-tools.html
└─BROKEN─ https://support.google.com/analytics/answer/2763052?hl=en (HTTP_404) -> False

Getting links from: docs/websites/website-about.html
├─BROKEN─ https://observablehq.com/@d3/zoomable-sunburst%3E (HTTP_404) -> True

Getting links from: docs/interactive/ojs/data-sources.html
└─BROKEN─ https://quartopub.com/profile (HTTP_500) -> True but expected (probably worth doing something anyway)

Getting links from: docs/publishing/github-pages.html
└─BROKEN─ https://deno.land/manual/standard_library (HTTP_404) -> True

Getting links from: docs/projects/virtual-environments.html
└─BROKEN─ https://www.w3.org/publishing/epub3/epub-packages.html#bib-typesregistry (HTTP_404) -> True

Getting links from: docs/reference/formats/presentations/revealjs.html
└─BROKEN─ https://wwwimages.adobe.com/content/dam/acom/en/devnet/indesign/sdk/cs6/idml/idml-specification.pdf (HTTP_404) -> True

Getting links from: docs/reference/formats/tei.html
├─BROKEN─ https://h.readthedocs.io/projects/client/en/latest/publishers/config/ (HTTP_404) -> True
└─BROKEN─ https://deno.land/std@0.125.0/datetime (HTTP_404) -> False

Getting links from: docs/reference/projects/books.html
├─BROKEN─ https://h.readthedocs.io/projects/client/en/latest/publishers/config/ (HTTP_404) -> True
├─BROKEN─ https://deno.land/std@0.125.0/datetime (HTTP_404) -> False

Getting links from: docs/reference/projects/manuscripts.html
├─BROKEN─ https://community.chocolatey.org/packages/rsvg-convert (HTTP_404) -> False
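
The report above follows a regular shape: a "Getting links from:" line naming the page, then one "BROKEN" line per failing URL. As a minimal sketch, assuming only the line format shown in this comment, the report can be grouped by page in Python. The names `parse_report`, `BROKEN_RE`, and `PAGE_PREFIX` are hypothetical and are not part of any link checker; the trailing "-> True/False" annotations are ignored.

```python
import re
from collections import defaultdict

# Matches lines such as "├─BROKEN─ https://isni.org/ (HTTP_403) -> False".
# The format is assumed from the report in this thread, not from a tool's spec.
BROKEN_RE = re.compile(
    r"^[├└]─BROKEN─\s+(?P<url>\S+)\s+\((?P<reason>[A-Z0-9_]+)\)"
)
PAGE_PREFIX = "Getting links from: "


def parse_report(text):
    """Group (url, reason) pairs by the page they were reported on."""
    report = defaultdict(list)
    page = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(PAGE_PREFIX):
            # Start of a new page section.
            page = line[len(PAGE_PREFIX):]
            continue
        match = BROKEN_RE.match(line)
        if match and page is not None:
            report[page].append((match.group("url"), match.group("reason")))
    return dict(report)


sample = """Getting links from: index.html
└─BROKEN─ http://conda.pydata.org/docs/install/quick.html (HTTP_404) -> True

Getting links from: docs/authoring/diagrams.html
├─BROKEN─ https://isni.org/ (HTTP_403) -> False
"""

result = parse_report(sample)
for page_name, links in result.items():
    for url, reason in links:
        print(f"{page_name}: {url} [{reason}]")
```

Grouping by page makes it easier to tell which source file needs editing. The reason code also helps separate real 404s from responses such as HTTP_403 or BLC_UNKNOWN, which may only mean that the site rejects automated requests.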

mcanouil commented 4 months ago

The following broken link is on the Quarto CLI side: https://www.w3.org/publishing/epub3/epub-packages.html#bib-typesregistry

BROKEN─ https://www.w3.org/publishing/epub3/epub-packages.html#bib-typesregistry (HTTP_404)

The best replacement I have found is: https://www.w3.org/publishing/epub32/epub-packages.html#bib-typesregistry but I am not sure it is correct (it is outdated).

Any suggestions?

mcanouil commented 4 months ago

The two links for ICML have no replacement. The link was already obscure, as it appears only in a forum thread about the ICML specification, which is not very publicly available. It seems the link was never very robust.

BROKEN─ https://wwwimages.adobe.com/content/dam/acom/en/devnet/indesign/sdk/cs6/idml/idml-specification.pdf (HTTP_404)

Any suggestions?

mcanouil commented 4 months ago

Source: https://github.com/quarto-dev/quarto-web/blob/main/docs/presentations/revealjs/advanced.qmd#L578

BROKEN─ https://rajgoel.github.io/reveal.js-demos/fullscreen-demo.html (HTTP_404)

mcanouil commented 4 months ago

Getting links from: https://quarto.org/docs/dashboards/examples/index.html
https://github.com/jjallaire/ojs-penguins-dashboard/blob/main/penguins.qmd
-> Very likely a "private repository" => Awaiting feedback about visibility change of the repository.

Getting links from: https://quarto.org/docs/reference/formats/epub.html
https://www.w3.org/publishing/epub3/epub-packages.html#bib-typesregistry
-> outdated with Epub 3.3 => No obvious replacement found.

Getting links from: https://quarto.org/docs/reference/formats/icml.html
https://wwwimages.adobe.com/content/dam/acom/en/devnet/indesign/sdk/cs6/idml/idml-specification.pdf
-> very hidden link initially found on a forum => No replacement found.

For the last two, I have no idea how, or with what, to fix the links.

cderv commented 4 months ago

Getting links from: https://quarto.org/docs/reference/formats/epub.html
https://www.w3.org/publishing/epub3/epub-packages.html#bib-typesregistry
-> outdated with Epub 3.3 => No obvious replacement found.

The new link is this one: https://www.w3.org/publishing/epub32/epub-packages.html#bib-typesregistry
It points to the registry at https://idpf.github.io/epub-registries/types/ which we could also use.

However, whether to update the description is a separate question that we could deal with. I don't know if we (through Pandoc) are producing EPUB 3.3 compatible output (or still 3.2). The equivalent for 3.3 is https://www.w3.org/TR/epub/#sec-opf-dctype and it says:

The dc:type element [dcterms] is used to indicate that the EPUB publication is of a specialized type (e.g., annotations or a dictionary packaged in EPUB format).

EPUB creators MAY use any text string as a value.

NOTE: The former IDPF EPUB 3 Working Group maintained a non-normative registry of specialized EPUB publication types for use with this element. This Working Group no longer maintains the registry and does not anticipate developing new specialized publication types.

So if 3.3 is what we output, we could just adapt the description to this new one (but there are probably others, so it would be its own update issue I guess).

For now, you can probably put the new link: https://www.w3.org/publishing/epub32/epub-packages.html#bib-typesregistry

cderv commented 4 months ago

Getting links from: https://quarto.org/docs/reference/formats/icml.html
https://wwwimages.adobe.com/content/dam/acom/en/devnet/indesign/sdk/cs6/idml/idml-specification.pdf
-> very hidden link initially found on a forum => No replacement found.

Those pages can be created based on Pandoc information, so you can usually try to find information there too. For a replacement, see

The new link would be https://manualzz.com/doc/9627253/adobe-indesign-cs6-idml-cookbook but honestly it is not the best: probably legitimate, but not official. I think the original has been removed from the web.

This comes from https://community.adobe.com/t5/indesign-discussions/where-is-the-idml-specification/m-p/13172633

We could also not provide a link; I don't think this will be read very often.

mcanouil commented 4 months ago

Regarding EPUB, that is the "outdated" link I found, but I don't know if it makes sense, as you pointed out (https://github.com/quarto-dev/quarto-cli/issues/9005#issuecomment-2002134988). Anyhow, the link is on the Quarto CLI side: https://github.com/quarto-dev/quarto-web/blob/main/docs/reference/formats/epub.json#L120

For Adobe InDesign, there is no official or reliable source, so the issue of the link being broken will be a continuous concern, while I don't see what the link brings to Quarto. I would be in favour of removing it.

Possible links:

cderv commented 4 months ago

This is the current equivalent link: https://www.w3.org/publishing/epub32/epub-packages.html#bib-typesregistry (epub3 replaced by epub32). This would mean no other change for now - just fixing the broken link.
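
As an illustration of this fix, here is a minimal Python sketch of the prefix swap. It assumes the only change needed is replacing the `epub3` path segment with `epub32`; the helper name `fix_epub_link` and both constants are hypothetical and do not come from Quarto's code.

```python
# Hypothetical constants: the old and new W3C path prefixes discussed above.
OLD_PREFIX = "https://www.w3.org/publishing/epub3/"
NEW_PREFIX = "https://www.w3.org/publishing/epub32/"


def fix_epub_link(url):
    """Return the URL with the old EPUB prefix replaced; leave others alone."""
    if url.startswith(OLD_PREFIX):
        # Keep the rest of the path and the fragment (e.g. #bib-typesregistry).
        return NEW_PREFIX + url[len(OLD_PREFIX):]
    return url


fixed = fix_epub_link(
    "https://www.w3.org/publishing/epub3/epub-packages.html#bib-typesregistry"
)
print(fixed)
```

Because the old prefix ends with a slash, a URL that already uses `epub32/` is left unchanged, so applying the helper twice is harmless.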

the issue of the link being broken will be a continuous concern

We could do as Pandoc does and use the other one, or just remove the link. The latter is probably fine.

Possible links:

Is IDML equivalent to ICML?

mcanouil commented 4 months ago

I believe ICML is a typo (made several times by the authors of Pandoc and in issues), as IDML stands for InDesign Markup Language (the file extension is .idml, amongst others).

ICML is the file type/extension for InCopy (Markup Language).

Content can be migrated between InDesign and InCopy. Here it is very ambiguous on the Pandoc side.

cderv commented 4 months ago

ICML is the file type/extension for InCopy (Markup Language).

Oh so it makes sense that --to icml is used but they should talk about InDesign IDML. Did not know that.

Anyhow, no need to spend more time on that. Just remove the link as you are suggesting.

Thanks a lot for the explanation!

mcanouil commented 4 months ago

The only remaining issue is:

Getting links from: https://quarto.org/docs/dashboards/examples/index.html
https://github.com/jjallaire/ojs-penguins-dashboard/blob/main/penguins.qmd
-> Very likely a "private repository" => Awaiting feedback about visibility change of the repository.