OpenTermsArchive / engine

Tracks contractual documents and exposes changes to the terms of online services.
https://opentermsarchive.org
European Union Public License 1.2
106 stars 30 forks source link

Handle all cases of email protection to prevent blinking and ease maintenance #959

Closed martinratinaud closed 1 year ago

martinratinaud commented 1 year ago

Here is an implementation of a global filter that aims at removing blink on several documents in an harmonized way.

For now, there has been an implementation that does not cover all cases and this lack of universality in the process is causing blinks on several instances and forcing contributors to create a filter for which we already have at least 3 versions.

Here are some affected documents

Thanks to this PR, blink will not occur anymore on any instance and will remove the need of custom filters, thus improcing stability and maintainability

martinratinaud commented 1 year ago

Noted.

I hope we find a way to handle this easily as if it was to be removed, it would create many blinks that would make maintenance harder.

Concerning the test case, it has been replaced by the new way of handling this kind of links so I'm not sure what you mean by

the test cases updated here do not demonstrate the improvements proposed

Can you please be more specific?

MattiSG commented 1 year ago

if it was to be removed, it would create many blinks that would make maintenance harder

I totally agree with you, the goal is really not to remove this but to replace it with something that is more predictable, visible and scalable so we can add more of those 🙂

it has been replaced by the new way of handling this kind of links

Exactly 🙂 So we don't prove that:

  1. We keep backwards compatibility (this would be proven by keeping the existing test).
  2. We handle the variations in email protection replacements that this changeset claims to add support for (this would be proven by changing the number of characters that are mangled).

In my view, a proper test for this changeset would have integrate a real-life example, or at least a distilled version of it. Right now, if we need to evolve the test for it to pass, we might have introduced a regression 🙂

Ndpnt commented 1 year ago

@martinratinaud @MattiSG What need to be done on this PR to close it?

martinratinaud commented 1 year ago

@MattiSG asked to rewrite the test cases but I have a workshop tomorrow and have no time to take care of that now.

Feel free to change it if you find the time.

Ndpnt commented 1 year ago

@martinratinaud I have looked at the examples you have given us but, as far as I have seen, none of them were not managed before this change. I feel like this modification is just a more readable refactor for the same behavior. Can you confirm or give us a specific example of how this change is improving?

martinratinaud commented 1 year ago

@Ndpnt on this snapshot, line 616

<a href="/cdn-cgi/l/email-protection#2d4e4243594c4e596d4e4459545e4e424259034858">conta<span class="__cf_email__" data-cfemail="87e4f3c7e4eef3fef4e4e8e8f3a9e2f2">[email&#160;protected]</span></a>,

You can see the conta<span

In the previous version, the code was only replacing the href whereas this version replaces the whole content of the a.

It is true though that it will not work on BlaBlaCar as the link is http://dataprotection@blablacar.com. Maybe we should add a test for dataprotection@*

I pushed a test for this behaviour

let me know if it's alright