livebook-dev / livebook

Automate code & data workflows with interactive Elixir notebooks
https://livebook.dev
Apache License 2.0
4.98k stars 427 forks source link

Deterministic Anchor Tags #277

Closed Steven0351 closed 3 years ago

Steven0351 commented 3 years ago

Environment

Current behavior

If you load a Livebook from a URL, any anchor links provided to sections are invalid and navigation is unable to be performed. Without looking at the implementation, I'm assuming this is randomly generated when the Livebook is loaded.

Expected behavior

I would expect for sections anchors to be deterministically generated so that a Livebook imported from a URL could preserve the intended behavior of links using anchor tags.


I used Livebook to write the majority of a blog post. When I went to validate the anchor tags by importing the livebook from my github repo, the anchor tags were invalid for that session and no navigation occurs to different sections.

josevalim commented 3 years ago

Hi @Steven0351, thanks for the report! Are you using anchor tags for sections/headers or for cells? We can probably make them deterministic for headers based on the contents but harder for cells.

Steven0351 commented 3 years ago

For the sections/headers, specifically I'm just taking the anchor from whatever the link button here gives me and tossing it in a [link](#anchor) for reference elsewhere

livebook header with link icon showing

In my current session it resolves to #section-aicgx6gm45elm2llwjocofy7nttqj52g, but it changes each time it's imported.

jonatanklosko commented 3 years ago

Hey @Steven0351!

@josevalim

We can probably make them deterministic for headers based on the contents but harder for cells.

Yeah, we can dynamically generate the element id and anchor link based on the content, there's no reason to couple it to the generated id. We could generate anchors for cells based on their position within section, but that's too brittle as adding/moving a cell would invalidate references.

cheerfulstoic commented 3 years ago

I'll have a go at this. I've read the notes about not using the position to avoid brittleness, so I'll see what I can do 😄

josevalim commented 3 years ago

Thanks @cheerfulstoic! The main idea is to generate it based on the section title itself. Similar to what GitHub does. Things to watch out for:

  1. We need to remove any character that is not valid in ID/anchor tab. For example, what is the anchor for a title of ## Hello José=!?:? You can see what GitHub does as a reference.

  2. There is also the issue with duplicate section titles. I will discuss with @jonatanklosko how we want to handle those and post another comment soon.

josevalim commented 3 years ago

If we have duplicate section titles, then we need to add suffixes to later sections. For example, if we have two sections titled "Example", one needs to have "example" as anchor and the other is "example-2". We can compute the section anchors inside the data_view computation (https://github.com/elixir-nx/livebook/blob/00edceafc9026ba1490b5032ccd3070dd57465b9/lib/livebook_web/live/session_live.ex#L724).

Eiji7 commented 3 years ago

@josevalim (…) We need to remove any character that is not valid in ID/anchor tab. For example, what is the anchor for a title of ## Hello José=!?:? You can see what GitHub does as a reference. (…)

Here is specification for HTML 4:

ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods (".").

Source: https://www.w3.org/TR/html4/types.html#type-id

and here is specification for HTML 5:

3.2.3.1 The id attribute

The id attribute specifies its element's unique identifier (ID). [DOMCORE]

The value must be unique amongst all the IDs in the element's home subtree and must contain at least one character. The value must not contain any space characters.

Source: https://www.w3.org/TR/html5-author/global-attributes.html#the-id-attribute

Which one we would follow?

Here is a regular expressions for id validation:

# HTML 4
~r/^[A-Za-z]+[\w\-\:\.]*$/

# HTML 5
~r/^[^\s]+$/
cheerfulstoic commented 3 years ago

Some things that I've noticed from my looking around / testing:

I'm still looking, but I'm wondering if it's maybe a timing thing with LiveView itself. Like, maybe it tries to find the element / scroll before the whole page is loaded. Maybe this is something that doesn't happen until you get to a certain sized LiveView app? Just a guess for now

josevalim commented 3 years ago

@Steven0351 I believe the issue is that the IDs change once you kill the server and open up the notebook again. So you can't say: "import this and then access this anchor tag". :) Also note that the goal is to fix this only for sections.

Steven0351 commented 3 years ago

@Steven0351 I believe the issue is that the IDs change once you kill the server and open up the notebook again. So you can't say: "import this and then access this anchor tag". :) Also note that the goal is to fix this only for sections.

That does appear to be the behavior, I never noticed this issue until I tried the import feature and all my section links were broken.

cheerfulstoic commented 3 years ago

Ah, right, that makes things much simpler. I read that before, but I didn't process it right :)

So it's easy enough to make anchor labels for sections / enumerate ones that are the same and I'll work on that 😄

But even if you restart the server or import to a new server, you'll have a different session ID for that notebook, so the links would still be broken (if I understand correctly). But one step at a time (and having the URL hashes be more readable is also a worthly goal)

jonatanklosko commented 3 years ago

But even if you restart the server or import to a new server, you'll have a different session ID for that notebook, so the links would still be broken

Yeah, but that's expected :) The use case is for adding references to other sections within the same notebook, so you could have a markdown cell like:

See [this section](#awesome-section)