Closed Rudd-O closed 3 months ago
To me the natural place for all high level uuid handling code would be plone.app.uuid
- linkintegrity is only one aspect of uuid handling. Can you elaborate why you choose that package?
Can you elaborate why you choose that package?
Because plone.app.linkintegrity
already is well-positioned to process the modifications — it already implements the proper object modification inbound hooks, and it deals with HTML markup already — whereas plone.app.uuid
does not appear to do so: it only offers the machinery to index and resolve UUIDs forward, and crucially doesn't know anything about content or markup.
We could also implement this proposal straight in another package (hypothetically plone.app.linkuidify
) but then we'd have to figure out a way to order the modification hooks before plone.app.linkintegrity
so the linkintegrity machinery receives the post-uidified URLs in its content modified hooks — IOW ordering of event processing becomes an issue, which happens to be solved for the output pipeline, but (to my knowledge) not for the input one. linkintegrity would probably depend on this new package then.
We could also possibly implement this as a Plone transform (it is extremely unclear to me how we'd do it). We'd still have to care about the ordering problem, though.
Context for plone.app.uuid
:
In general +1 for the direction, but it is quite some work, not sure what milestone you have in mind, for Plone 6.0 it looks like a bit too much. But if it is non breaking can be done later too. I think plone.app.linkintegrity is the better place, at the end it's about link integrity. Please no addition package ;). Would you work on this, when it's accepted by the FWT?
This would be a non-breaking change. Even if we were to back the change out, Plone already has included — for a very long time — the capability to forward-resolve from UID to URL, so any users affected by this change being backed out would not experience any content breakage at all.
It's not that much work — the meat of the code already exists in c.ei
(ask @pbauer — he wrote most of it!), and I estimate wiring the existing code should clock in at less than 500 LOC, probably substantially less. Here is what we have, which I have already made sure is consistent with resolveuid handling throughout the rest of Plone:
Even regarding linkintegrification of existing content, at no point in time will we ask future Plone users to do any sort of migratory step if they dont want to, but we can kindly point them to collective.exportimport
if they do positively want to apply link integrity to existing content, as the tool already has this baked in. Of course, future work may decide to port that feature into Plone core — but that would be out of scope for this project.
I would most likely not work full time on this, but I have a substantial sum of money (EUR / CHF / USD / BTC offered) set aside to fund this project, to whomever in the Plone community wants to tackle this. I will also be reviewing and likely making minor changes to the code as it evolves.
And of course the result will be as free software as Plone itself.
@Rudd-O can you elaborate your plans regarding Plone 6 and especially Volto? Both link integrity and resolveuid might be subject to change in the future. We have to make both features fully available via REST API and usable for any consumer system (mainly but not only Volto). We will also have to think about making all TinyMCE transformations optional, since there is no TinyMCE in Plone 6 / Volto.
For Volto we are currently in the process of moving from DraftJS to Slate editor. When this is done we plan to approach the linkintegrity use case (resolveuid works already). It does not make much sense to put effort into linkintegrity for DraftJS when we move to Slate before Plone 6. Those efforts might align nicely with this PLIP. Though, I would be strongly against a large refactoring of code that does not take the Plone 6 / Volto use case into consideration.
I also agree with @MrTango that this might be a big change and it is too late for Plone 6.
Implement two IResolveUIDer adapters, one for ATCT and one for Dexterity.
Forget about the ATCT one. That is Archetypes, so only Plone 5.2 on Python 2.7. That is not something I as release manager would want such a big change in: too risky for a Plone version that should remain stable.
Archetypes
Fixed by removing that part from the proposal.
@Rudd-O can you elaborate your plans regarding Plone 6 and especially Volto? Both link integrity and resolveuid might be subject to change in the future. We have to make both features fully available via REST API and usable for any consumer system (mainly but not only Volto). We will also have to think about making all TinyMCE transformations optional, since there is no TinyMCE in Plone 6 / Volto.
In short: Volto will not be impacted — however, Volto developers may be able to save a lot of hard-to-get-right labor in the future.
The proposal involves only embedding the resolveuidification (currently done by TinyMCE) into Plone core backend.
From Volto's perspective, Volto will continue to POST content to be saved to Plone, just as it does today. All that would happen is that Plone would, in the background, change internal references to be resolveuidified.
For Volto we are currently in the process of moving from DraftJS to Slate editor. When this is done we plan to approach the linkintegrity use case (resolveuid works already). It does not make much sense to put effort into linkintegrity for DraftJS when we move to Slate before Plone 6. Those efforts might align nicely with this PLIP. Though, I would be strongly against a large refactoring of code that does not take the Plone 6 / Volto use case into consideration.
That's great news! It means that the Slate editor would also automatically benefit from this work.
If this proposal is implemented, people working on the Volto side will have to do no effort whatsoever to benefit from automatic resolveuidification. Every RichText content saved by Volto will automatically be resolveuidified.
There will be no large refactoring of any part of any codebase as part of this proposal. No API surfaces will change.
@Rudd-O My five cents!
Since some time ago, we have the "Blocks Transforms" for serializers/deserializers in plone.restapi.
https://plonerestapi.readthedocs.io/en/latest/blocks.html?highlight=transform#block-serializers
It is a powerful tool that we developed not just for solving resolveuid subject in Volto blocks, but for other purposes (indexing, injecting data, etc). Basically, it's a ZCA-only powered portal_transforms-like tool. I think we should consider to make this idea backend wide, not only for Volto blocks, so it can be reused everywhere we need to transform data back and forth. At some point we started toying with the idea to apply the blocks transforms to content types and fields themselves.
We need to synchronise efforts so we do not end up with two tools/ways to achieve the same thing, or worse, yet another portal_transform tool.
Some pointers on how resolveuid is solved with blocks transforms:
ResolveUID deserializer https://github.com/plone/plone.restapi/blob/master/src/plone/restapi/deserializer/blocks.py#L91
ResolveUID serializer https://github.com/plone/plone.restapi/blob/master/src/plone/restapi/serializer/blocks.py#L79
I can imagine that the way to generate/resolve the resolveuid using unified interfaces and entry points could (and should be improved) so I'm looking forward to this PLIP.
If I may suggest a couple of things, first one being, that instead of spreading core features in satellite packages, we either start concentrating them into the core packages. We talked a bit about this with @jensens during the Plone Conference at Sorrento and it feels to me that everybody agrees on that.
Finally, I think that Plone 6 frontend should be included in the "Implementation Proposal" section so we make sure that we do not implement (and merge) a PLIP that only works in classic and we work on PLIPs that are stack-wide aware and provide vertical solutions rather than horizontal ones.
@sneridagh It looks like the uid_to_url and url_to_uid functions need to be updated to deal with whole tags rather than single URLs (as URL processing in some cases requires "multiple URLs" in a single URL, witness the case of source srcset=
and friends). The code in collective.exportimport does this already.
I still think the most vertical solution is to rely on the PortalTransforms machinery rather than writing ad-hoc ser/deser code, which is (to my limited understanding) what the code you linked currently does. If that code used PortalTransforms from the start, it would benefit from this PLIP out of the box.
EDIT: I'm specifically talking about https://github.com/plone/plone.restapi/blob/master/src/plone/restapi/serializer/blocks.py#L89 which is code that does (less of) what https://github.com/collective/collective.exportimport/blob/af403e2c21cee549f40937bb3ce6f277cc7212be/src/collective/exportimport/fix_html.py#L92 does. If the plan above materializes, and those blocks use the PortalTransforms machinery — as they should, if they are rich text HTML fields — then you don't need any of this code and it can all safely be deleted.
@sneridagh It looks like the uid_to_url and url_to_uid functions need to be updated to deal with whole tags rather than single URLs (as URL processing in some cases requires "multiple URLs" in a single URL, witness the case of
source srcset=
and friends). The code in collective.exportimport does this already.
We do not deal with tags... nor with HTML as a whole, we deal with JSON. Slate saves to JSON. We deal with tags only on rendering, since tags are only rendered when they should be, in JSX. Which is the whole point. That code runs on every URL that it finds given a specific block.
I still think the most vertical solution is to rely on the PortalTransforms machinery rather than writing ad-hoc ser/deser code, which is (to my limited understanding) what the code you linked currently does. If that code used PortalTransforms from the start, it would benefit from this PLIP out of the box.
We implemented the blocks transforms because a reason, and they provide functionality beyond what PortalTransforms provide right now. It's not add-hoc, it's the evolution of PortalTransforms.
EDIT: I'm specifically talking about https://github.com/plone/plone.restapi/blob/master/src/plone/restapi/serializer/blocks.py#L89 which is code that does (less of) what https://github.com/collective/collective.exportimport/blob/af403e2c21cee549f40937bb3ce6f277cc7212be/src/collective/exportimport/fix_html.py#L92 does. If the plan above materializes, and those blocks use the PortalTransforms machinery — as they should, if they are rich text HTML fields — then you don't need any of this code and it can all safely be deleted.
No it's not. that one gets HTML (via it's soup representation) and fixes it (as a whole). Where that piece of code is used? and for what purpose? Again, we can't get rid of the Blocks transforms because the PortalTransforms are not that flexible.
As said, I think that the blocks transforms serializers/deserializers idea and features is great (not my making, @tiberiuichim and @cekk to blame) and I really think that it's quite a powerful one, and one that could be bended even more to work with fields (and specifically with rich text fields).
Please, @plone/framework-team due to the verticality of this PLIP, whenever is going to be discussed, it would be great if you please invite the @plone/volto-team to discuss it.
OK @sneridagh, I may have misunderstood — I thought these blocks dealt with HTML rich text fields.
This PLIP is strictly limited to dealing with uniqueid'ifying HTML rich text fields as they are fed into Plone for storage. It does not influence the rendering of any data sources, JSX or HTML — within the scope of this PLIP, that remains untouched and the same.
So it's really orthogonal work which will not influence Volto in any way, and will benefit Classic. Of course, more eyes are welcome.
I still support the basic idea (details need discussion), but I veto against adding IResolveUIDer interface and implementation to p.a.linkinterity - it's place is plone.app.uuid
. It is all out of scope of what I consider a linkintegrity checker package to do.
Overall I would like to see an implementation scanning arbitrary (not limited to HTML) text for links, a detection if they are internal and not already based on UUIDs, then replace them by matching UUID links. (and vice versa). This basic/generic and robust/very well tested implementation then can be used in several places, like for JSON/slate, for HTML/TinyMCE, at export/import time and so on.
plone.app.uuid
seems to me lower in the dependency stack than plone.app.linkintegrity
— why should plone.app.uid gain HTML parsing functionality as part of this project, when plone.app.linkintegrity
is where that sort of thing is happening nowadays?
Alternatively, I can create a separate package. Either way, plone.app.linkintegrity
is going to use code from that hypothetical new package, and plone.app.uuid
won't.
@jensens your view?
(No disagreement on the second comment, albeit I have no plans for doing this to the Markdown interfaces — there is no precedent for Plone to do that kind of transformation.)
My POV:
a) no new packages!
b) Place code where people would expect it. I do not expect IResolveUID
in something called linkintegrity, getting my point?
c) pa.linkintegrity can depend on pa.uuid - why not.
Please sketch
(No disagreement on the second comment, albeit I have no plans for doing this to the Markdown interfaces — there is no precedent for Plone to do that kind of transformation.)
Well, if you break it down its about finding UUID-links in text, replacing them with full URL links. It does not matter much what the text is semantically.
Well, if you break it down its about finding UUID-links in text, replacing them with full URL links. It does not matter much what the text is semantically.
Not at all. @jensens I get the vibe that you believe the purpose of this PLIP is the antipode of what it actually is.
This PR does not touch the part of Plone that renders UUID links into proper HTML URLs. That will remain untouched.
The purpose of this PR is the opposite — evaluate HTML URLs and see if they can be transformed to UUID links, then transform them if possible.
My plan so far involves:
c.exportimport
that already does this, placing it here. This is 100% code reuse, and the code is well-tested now.This modification would be preferably happening to p.a.linkintegrity
because the purpose of this PR is to help preserve link integrity, and that's also what the purpose of the link integrity package is — to preserve link integrity.
No new packages!
Anyway, reverse lookup would fit in plone.app.uuid
as well.
However, you need to get the PLIP out of draft mode and the official path over the framework team (I am not part of any longer) applies, who will then review it.
No activity since 2022. Closing this PLIP. Feel free to re-open if anyone disagrees.
No activity since 2022. Closing this PLIP. Feel free to re-open if anyone disagrees.
Still open.
I close this for now. If you plan to work further on this please reopen and go on.
PLIP (Plone Improvement Proposal)
Esteemed @plone/framework-team - I am submitting this PLIP for your consideration.
Responsible person
Proposer: @Rudd-O
Seconder: still needed
Abstract
This PLIP aims to improve both Plone's rich media story as well as link integrity, by centralizing the generation of resolveuid URLs which aid in the robustness of Plone content. Both Plone Classic and Volto users will benefit, transparently, without any codebase changes needed higher in the stack.
TL;DR: we propose to implement resolveuidification — the counterpart of portal_transforms' resolveuid to URL.
The scope is limited strictly to the intersection between Plone 6 and
text/html
Rich Text fields in Dexterity content types.Overview, context and background
Plone has a very robust, best-in-class link integrity feature that couples three disparate aspects to achieve strong robustness with hyperlinks to content:
a href
,img/video/audio src
, andsrcset
.This PLIP tackles the matter number 2 in the list above — the way that resolveuid URLs are generated is not consistent / has "holes". For example, manually-entered hyperlinks or image references are not resolved to resolveuid URLs — only TinyMCE-created ones (via TinyMCE GUI plugins) are. Furthermore, other types of linkable content objects, such as IFRAMEs and VIDEO / AUDIO / srcset= links, are also not rendered as resolveuid URLs.
Aside: there exist products like collective.wildcardmedia which abuse
a href
s to embed audio / video, in order to address this functionality gap. This sort of handling of the issue offers the Plone user considerably less control of the final markup used to render their embedded media content, relies on client-side JS to have any effect at all, and does not cover embeddable content like PDFs.Given the many ways in which users may add content to their Plone sites (via TinyMCE, or imports, or raw HTML, or Volto, or even Python scripts), it makes sense to centralize the resolveuid generation in a single place that applies to all of them, and then rip out the other code where these links are generated — to reduce our maintenance burden.
It is furthermore — at least as of today — arguably suboptimal layering to have this functionality in a TinyMCE Plone pattern, hidden in various layers of indirection that need to be reimplemented by each content type being inserted (link, image, and the as-of-now-still-unable-to-resolveuid media insertion plugin). It would also be suboptimal if resolveuidification had to be implemented in Volto itself — it would require exposing more of the Plone API, and multiple roundtrips (at least one per hyperlink) to resolveuidify content that will be saved by Volto.
Concrete implementation details proposed
Note: All of these items will be gated to the linkintegrity registry setting, as the preexisting functionality already is.
data-val=<UID>
attributes with the UID of the object, which is unknown to me what they are used for (perhaps to reflexively help TinyMCE later?). Perhaps IResolveUIDer would add those too, for completeness. It appears that these attributes do survive the roundtrip when the/edit
view is invoked?@@fix_html
-- I have been using it regularly to ensure that my new content's links is rewritten in resolveuid form, and it works great (but it is annoying to use a view from a non-core add-on manually every time prior to publishing).text/html
fields benefits from automatic resolveuidification. This includes both Volto and TinyMCE.Stretch goal:
Out of scope (currently) but would be real nice to have:
collective.exportimport
if they do positively want to apply link integrity to existing content, as the tool already has this baked in.c.ei
feature into Plone core, either as a CP configlet or as a migration step — but any of that would be out of scope for this project.Risks
Funding
If we can get enough consensus on the implementation details, I am willing to fund the development of this PLIP till completion. The reason I propose to fund core development is that I am likely not capable enough to implement this by myself.