c2corg / v6_ui

UI for c2c.org v6
GNU Affero General Public License v3.0
7 stars 12 forks source link

Parsing C2C custom tags in documents text attributes #131

Closed asaunier closed 7 years ago

asaunier commented 8 years ago

In v5 a log of formatting is possible and based upon several syntaxes, mainly BBcode and Markdown. For the latter syntaxes a PR is available at #127

Many custom tags are also available:

See the documentations about those syntaxe:

asaunier commented 8 years ago

wikilinks => see #156

asaunier commented 8 years ago

Images

First thing is to figure out weither to an extension for the bbcode parser or for the markdown parser.

Extending the bbcode parser looks fairly easy, see the doc/example at http://bbcode.readthedocs.org/en/latest/formatters.html

On the other hand the markdown parser seems a bit more flexible and powerful (and in the future we might want to get rid of the bbocode!?) => then it's probably smarter to only write markdown extensions?

There's a quite annoying issue with images: the img tags contains the document_id, not the filename itself. For instance [img=188119 right]Lever de soleil[/img]. => we will have to make a query to the database (ie. to the API) to get the actual filename.

stef74 commented 8 years ago

Liste of non supported syntax

fjacon commented 8 years ago

Highest priority : images

asaunier commented 8 years ago

The tricky thing with images is that the tag contains the image id but not the real filename (eg. [img=188119 right]Lever de soleil[/img]).

Perhaps a solution could be to convert the tag to some angular directive that makes an AJAX request to the API to get the image filename based upon the document id => embedded images would then be loaded asynchronously.

Or we could have a kind of proxy (webservice) that would convert an image-id-based image URL to the real image URL.

tsauerwein commented 8 years ago

Perhaps a solution could be to convert the tag to some angular directive that makes an AJAX request to the API to get the image filename based upon the document id => embedded images would then be loaded asynchronously.

Aren't these tags parsed on the UI server-side?

asaunier commented 8 years ago

Aren't these tags parsed on the UI server-side?

Not yet but they could. I'm just worrying that too many API requests on the UI server-side might delay the page rendering too much!?

asaunier commented 8 years ago

Liste of non supported syntax

This list should be considered in several parts.

Concerning tables, please note that http://www.camptocamp.org/articles/151910/fr/aide-mise-en-forme-du-texte#column describes a bbcode-like [col] which is obviously not Markdown-supported. On the other hand, the L# formatting seems to rely on the markdown table system. It's a bit a pity to use several redundant syntax system. Perhaps we should drop the [col] tag support?

Concerning images, the tricky thing is too convert the image id into an actual file URL (the filename changes if a new version of the image is uploaded whereas the image id remains the same). The filename is not available directly in the tag itself. See https://github.com/c2corg/v6_ui/issues/131#issuecomment-229337659 Several approaches are possible for the conversion: on the UI server side or on the UI client side with an asynchronous request. For both cases we probably need to make a request to the API to get the filename according to an image id (and a version id?). Additional info such as the image title + the width/size would be needed as well.

asaunier commented 8 years ago

Concerning tables, please note that http://www.camptocamp.org/articles/151910/fr/aide-mise-en-forme-du-texte#column describes a bbcode-like [col] which is obviously not Markdown-supported. On the other hand, the L# formatting seems to rely on the markdown table system. It's a bit a pity to use several redundant syntax system. Perhaps we should drop the [col] tag support?

After having a closer look at this tag, it seems it is not really a table syntax. Those tags are actually replaced by <div> and <p> tags with predefined CSS classes. So the processing could probably be done using a filter such as https://github.com/c2corg/v6_ui/blob/master/c2corg_ui/format/wikilinks.py

tsauerwein commented 8 years ago

Several approaches are possible for the conversion: on the UI server side or on the UI client side with an asynchronous request.

Making the request on the UI client side is probably better. The request to get the page from the UI server side returns faster, and the user could already see the page, while the image info is requested in the background.

asaunier commented 8 years ago

+1 The fact that making client side conversion would help loading the page faster is important.

Please note however that:

asaunier commented 8 years ago

Concerning images, the tricky thing is too convert the image id into an actual file URL (the filename changes if a new version of the image is uploaded whereas the image id remains the same). The filename is not available directly in the tag itself. See #131 (comment) Several approaches are possible for the conversion: on the UI server side or on the UI client side with an asynchronous request. For both cases we probably need to make a request to the API to get the filename according to an image id (and a version id?). Additional info such as the image title + the width/size would be needed as well.

A possibility would be to make a simple conversion from

[img=37130 right]Gramusset[/img]

to

<img src="//api.camptocamp.org/images/proxy/37130" class="right-image" alt="Gramusset">

api.camptocamp.org/images/proxy/37130 would be a proxy service in the API that would make a request to the database to get the image filename and would redirect to the actual URL of the image file.

This way:

@tsauerwein do you think it could work?

tsauerwein commented 8 years ago

@tsauerwein do you think it could work?

That would work.

I just had another idea: We could also replace the image tags directly in the API when returning the document. For example when requesting /wayponts/1, the API would replace [img=37130 right]Gramusset[/img] with [img filename=/.../xyz.jpg right]Gramusset[/img]. This way we would avoid a couple of round-trips.

Requesting a document for editing would still return [img=37130 right]Gramusset[/img]. This might be a problem for the preview. For this we could still use the proxy redirect technique that you proposed. But the most common case (showing a document) would be optimized.

asaunier commented 8 years ago

That sounds interesting.

A drawback of this method is that some of the tag parsing (at least the id>filename conversion) is deported to the API instead of having all the parsing done at a single place (in the UI). In addition there's this consistency issue between the read/preview/write.

fbunoz commented 8 years ago

A drawback of this method is that some of the tag parsing (at least the id>filename conversion) is deported to the API instead of having all the parsing done at a single place (in the UI).

Another solution for the edition, is to add a field in the data returned by the API, which contains an array with IDs and file names of all images associated to the current doc :

{
    ID1 : file_name1
    ID2 : file_name2
    ID3 : file_name3
}

For the view page, these info are already downloaded to generate the image list.

tsauerwein commented 8 years ago

... of all images associated to the current doc

Can you only insert an image tag for images that are associated to the document? I think we don't want to enforce that constraint in v6?

But I like your idea of including the filenames on the returned document. The API would have to parse the text for image tags, but that is ok.

For the preview we still might need an API web-service which returns filenames for a list of image ids.

asaunier commented 7 years ago

I have tested enabling some standard Python-Markdown extensions with PR https://github.com/c2corg/v6_ui/pull/695 See my comments in it.

To summarize:

stef74 commented 7 years ago

@asaunier I propose to close ... for toc2 we can create dedicated issue with association label

asaunier commented 7 years ago

I prefer to leave the issue open because it contains many info and analysis about the tag parsing.

asaunier commented 7 years ago

Tutorial about extending python-markdown: https://github.com/waylan/Python-Markdown/wiki/Tutorial:-Writing-Extensions-for-Python-Markdown

asaunier commented 7 years ago

I have the feeling that using both bbcode and markdown formatting is a nonsense:

What about reyling only on the markdown parser?

It would mean we would have to convert the bbcode tags by markdown tags (at least the common ones) when migrating the data. But some converting is needed anyway: https://github.com/c2corg/v6_api/issues/417

asaunier commented 7 years ago

An other argument for getting rid of the bbcode: it will make things even more complicated when we will add a text formatting wizard/editor! See https://github.com/c2corg/v6_ui/issues/305

asaunier commented 7 years ago

I set this issue to milestone golive because it requires changes in the migration script, at least for some tag converting https://github.com/c2corg/v6_api/issues/417

fjacon commented 7 years ago

@arnaud-morvan or @tsauerwein : have you taken Alex last comment into account in migration script ?

tsauerwein commented 7 years ago

have you taken Alex last comment into account in migration script ?

The issues related to the migration are tracked here: https://github.com/c2corg/v6_api/issues/417 One point is left: https://github.com/c2corg/v6_api/issues/417#issuecomment-256397038

asaunier commented 7 years ago

One point is left: c2corg/v6_api#417 converting common bbcode tags to markdown??

This is an open question, I don't have the answer. Getting rid of bbcode tags would imply to make more conversions in the migration script (as done for some shortcuts and wikilinks in https://github.com/c2corg/v6_api/pull/478) but I have the feeling it might simplify things in the future to have only one formatting syntax instead of 2.

@stef74 Please note that I have not made the [toc] syntax simplification (removing unsupported options) (striked out in https://github.com/c2corg/v6_api/issues/417#issue-181414714) because @fbunoz was opposed to it (see https://github.com/c2corg/v6_ui/pull/695#issuecomment-253619271). As a result there are probably many pages whose TOC is broken, I am not sure it is really relevant.

stef74 commented 7 years ago

@asaunier this ticket can be close ? (dedicated ticket already exist) and no action on SA side ?

tsauerwein commented 7 years ago

I created a separate issue for the remaining problem (bbcode -> Markdown): https://github.com/c2corg/v6_api/issues/516

tsauerwein commented 7 years ago

Sorry that was the wrong issue.