Passing metadata around

NickHu commented 5 years ago

Today @jamievicary and I had a discussion about how we're passing state around. Conceptually, a project has data associated to it:

mathematical data: the proof object, signature, workspace etc.
metadata: title, abstract etc.

We agreed that it's a good idea to keep these separate when, for example, saving to the database. In this way, we can do things like query for projects with abstracts containing a certain keyword, and other querying operations in general against the metadata.
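As a rough illustration of why the split helps, here is a minimal sketch using an in-memory SQLite database. The table layout, column names, and helper functions are all assumptions made for this example, not the project's actual schema; the mathematical data is simply stored as an opaque blob next to the queryable metadata.

```python
import sqlite3

# Hypothetical schema: metadata in plain text columns, mathematical data as a blob.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE projects ("
    "id INTEGER PRIMARY KEY, title TEXT, abstract TEXT, content BLOB)"
)


def save_project(conn, title, abstract, content):
    """Store one project and return its row id."""
    cur = conn.execute(
        "INSERT INTO projects (title, abstract, content) VALUES (?, ?, ?)",
        (title, abstract, content),
    )
    return cur.lastrowid


def find_by_abstract_keyword(conn, keyword):
    """Return titles of projects whose abstract contains the keyword."""
    rows = conn.execute(
        "SELECT title FROM projects WHERE abstract LIKE ? ORDER BY id",
        (f"%{keyword}%",),
    )
    return [row[0] for row in rows]


save_project(conn, "Eckmann-Hilton", "A proof about braiding of 2-cells", b"\x00\x01")
save_project(conn, "Snake", "Adjunction snake equations", b"\x02")
matches = find_by_abstract_keyword(conn, "braiding")
```

The query never has to decode the proof object, which is the point of keeping the two kinds of data apart.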

Currently, this metadata isn't represented in the application state. Once it is, we need to decide how to pass it around. At the moment, the mathematical data is serialised, compressed, and passed around as a URL fragment. One option is to serialise the metadata as well and include it in the URL.

This has the benefit that when you give someone a URL, they can paste it into their web browser and get all of the project data, including the title and abstract etc.
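The serialise-compress-encode round trip can be sketched as follows. This assumes JSON for serialisation, zlib for compression, and URL-safe base64 for the fragment; the web client's actual format may well differ, and the function names are hypothetical.

```python
import base64
import json
import zlib


def encode_fragment(state: dict) -> str:
    """Serialise, compress, and base64-encode state for use after '#' in a URL."""
    raw = json.dumps(state, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
    packed = zlib.compress(raw, 9)
    return base64.urlsafe_b64encode(packed).decode("ascii")


def decode_fragment(fragment: str) -> dict:
    """Invert encode_fragment."""
    packed = base64.urlsafe_b64decode(fragment.encode("ascii"))
    return json.loads(zlib.decompress(packed).decode("utf-8"))


# Assumed shape: metadata and mathematical data side by side in one object.
state = {
    "metadata": {"title": "My proof", "abstract": "An amazing proof"},
    "proof": {"signature": [], "workspace": None},
}
fragment = encode_fragment(state)
url = "https://homotopy.io/#" + fragment
```

Anyone holding `url` can recover the full state with `decode_fragment(url.split("#", 1)[1])`, without contacting a server.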

I'm skeptical about leaning too heavily on this state-only-exists-in-the-URL model. To summarise,

URLs are expected to be very long. Megabyte-sized URLs are atypical, and relying on browser-specific handling of them is perhaps not very robust;
I feel like the 'correct' way to do this is to just pass around a JSON object with fields for title, author, and content (the mathematical data hash).
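A minimal sketch of such a JSON object, with export and import helpers, might look like this. The field names follow the ones listed above; the class and function names are hypothetical, and `content` is treated as an opaque string.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Project:
    title: str
    author: str
    content: str  # opaque encoded mathematical data


def export_project(project: Project) -> str:
    """Serialise a project to a human-readable JSON document."""
    return json.dumps(asdict(project), ensure_ascii=False, indent=2)


def import_project(text: str) -> Project:
    """Parse a JSON document, rejecting it if a required field is absent."""
    data = json.loads(text)
    missing = {"title", "author", "content"} - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return Project(title=data["title"], author=data["author"], content=data["content"])


original = Project(title="My proof", author="A. N. Author", content="AAECAw==")
restored = import_project(export_project(original))
```

The same JSON text works for a file on disk, a database row, or a request body, so it covers the offline and no-server workflows without depending on URL length limits.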

When considering this, we ought to be mindful of the following workflows:

normal use of homotopy.io;
people who don't want to store any data on our servers at all;
offline work - say, you've loaded up homotopy.io and you're boarding a flight, and you want to do some work on the plane, and you've already downloaded the data for a bunch of projects that you might want to look at.

NickHu commented 5 years ago

Jamie had some valid concerns regarding usability:

base64 encoding removes friction from having Unicode around, which pdflatex doesn't play nicely with by default --- this means that you can dump a URL in the tex source of a paper you're working on, making it a hyperlink, and not have to worry about it (albeit the actual URL might be very long so it would probably be hidden);
novice users won't be tempted to edit a base64-encoded URL fragment and break encoding-related things.

Personally, I am of the opinion that if we do go the route of putting the metadata in a URL fragment (which I am skeptical of), we shouldn't base64 encode and it should be visible (in a %-style encoded form, similar to Google Maps) to the user.
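For comparison, a percent-encoded fragment that keeps the metadata readable could be sketched like this. The key names and the query-string-like layout of the fragment are assumptions for illustration only.

```python
from urllib.parse import parse_qs, quote, urlencode


def metadata_fragment(title: str, abstract: str, content: str) -> str:
    """Build a fragment where the metadata stays human-readable."""
    fields = {"title": title, "abstract": abstract, "content": content}
    # quote (rather than the default quote_plus) encodes spaces as %20.
    return urlencode(fields, quote_via=quote)


def parse_fragment(fragment: str) -> dict:
    """Recover the fields, taking the first value for each key."""
    parsed = parse_qs(fragment, keep_blank_values=True)
    return {key: values[0] for key, values in parsed.items()}


fragment = metadata_fragment("My proof", "An amazing proof", "AAAA")
```

Here `fragment` is `title=My%20proof&abstract=An%20amazing%20proof&content=AAAA`, so a reader can see the title and abstract at a glance while the content stays opaque.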

My preference is to use the JSON file export/import structure in the application, rather than trying to be very clever about encoding everything into the URL.

NickHu commented 5 years ago

Here's an idea of how it might look to use plain JSON instead of encoding everything into a giant URL.

hyperref, the LaTeX package for URLs, can create PDF forms with a submit button that issues a POST request. As we're not going to render the URL in the PDF anyway, due to its length, we can in theory still get the same one-click PDF hyperlink that opens a proof stored entirely locally. Here's a mock-up of how it might look in the PDF:

So as you can see, my amazing proof does XYZ. Here is the proof object (really just JSON):

    +------------------------------------+
    | title: My proof,                   |
    | abstract: An amazing proof,        |
    | content: <base64 encoded stuff>    |
    +------------------------------------+
    | Click here to open in homotopy.io  |
    +------------------------------------+

Now, clicking the 'button' would leverage hyperref to send a POST request to homotopy.io and load the project entirely from the data sent with the request. Alternatively, if the user doesn't want to use homotopy.io or is offline, they could copy-paste the JSON into their own local instance via the import functionality. We could hide or replace the top box with a data URI, also with hyperref, if we want to avoid rendering the content string.
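The data URI variant could be produced along these lines. This is only a sketch: the `application/json` media type and the helper names are assumptions, and how hyperref would embed the result is not covered here.

```python
import base64
import json

PREFIX = "data:application/json;base64"


def to_data_uri(project: dict) -> str:
    """Pack a project's JSON into a base64 data URI."""
    payload = json.dumps(project, ensure_ascii=False).encode("utf-8")
    return PREFIX + "," + base64.b64encode(payload).decode("ascii")


def from_data_uri(uri: str) -> dict:
    """Unpack a data URI produced by to_data_uri."""
    header, _, body = uri.partition(",")
    if header != PREFIX:
        raise ValueError("not a base64 JSON data URI")
    return json.loads(base64.b64decode(body).decode("utf-8"))


project = {
    "title": "My proof",
    "abstract": "An amazing proof",
    "content": "AAECAw==",  # placeholder for the encoded proof
}
uri = to_data_uri(project)
```

A local instance's import function could accept either the raw JSON or such a URI and call `from_data_uri` on the latter.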

jamievicary commented 5 years ago

Hi Nick, thanks for raising this discussion here. I'm happy to have the metadata in the hash string in URL-encoded form.

jamievicary commented 5 years ago

Regarding this point about a POST request, yes, that's a possibility, and it would certainly be useful for the rare cases where the proofs are too long for a hash string.

Using this as the only mechanism for PDF-embedded proofs would be a bad idea, as again it is nontrivial: people would have to go to considerable lengths to find out about this highly non-obvious proof embedding method. The point about the current URL encoding plan is that it is brain-dead obvious: even a novice user will realize within 5 seconds that the state of their proof is encoded in the URL, and it will then be completely obvious to them how to embed that in their research paper, making it much more likely that they actually do this. Most users will lack the time, motivation, sophistication, or diligence to read any documentation at all, let alone the details of a complex proof embedding method.

homotopy-io / homotopy-webclient

Passing metadata around #59