shaarli / Shaarli

The personal, minimalist, super-fast, database free, bookmarking service - community repo
https://shaarli.readthedocs.io/

Fail to rename tags with é or è characters #2068

Open landure opened 9 months ago

landure commented 9 months ago

I'm using Shaarli v0.13.0. The issue was also present on v0.12.*.

I've got tags with French characters, such as éditeur. Renaming this tag fails, both when using admin/tags and tags/list.

The issue seems to come from $this->container->bookmarkService->search(), which doesn't return any bookmarks when matching a tag with accented characters in its name.

Thank you for your work.

nodiscc commented 9 months ago

Hi,

I can't reproduce this on https://demo.shaarli.org/ (master) and on my own instance (v0.13.0, installed from release zip).

Can you share more details about your setup? Installation method? Detailed installation/configuration steps? If installed from the release zip or from source/composer, what is the output of $ grep -v "#" /etc/locale.gen on the server?

Thanks

landure commented 9 months ago

I installed Shaarli using Docker and the ghcr.io/shaarli/shaarli:release image, behind a Traefik router, and I use the material theme.

Attached to this comment is an archive with my config.json.php (with secrets expunged) and datastore.php files.

data.tar.gz.

The problem may come from my datastore, but exporting it to HTML and importing it into a clean Shaarli instance did not solve it.

Thank you.

nodiscc commented 9 months ago

Does this happen if you switch back to the default theme?

landure commented 9 months ago

Yes, it does happen with the default theme too. I tried with a fresh import from HTML, the default theme and no plugins, to make sure it was an issue with Shaarli.

nodiscc commented 9 months ago

To make sure I'm interpreting this correctly, can you reproduce this on the demo instance? (I could not, but maybe I'm missing something)

If not, then the only difference that is apparent to me, with the info I have, is the Shaarli version (the demo instance is redeployed from the :latest image every day as far as I know, while you use the :0.13.0 image). Maybe you could try with the :latest image (back up your data first, or use a separate container, etc.), and report whether the bug is still present for you on :latest?

If it is still present, it may indicate something wrong with another component, e.g. your reverse proxy configuration - in which case can you post your Traefik config?

Or I could try to reproduce with the :0.13.0 image, but it may take a while since I don't have a Docker setup at hand right now.

landure commented 9 months ago

I've tried with the latest image, without change, but I think I found the "core" of the issue. During my tests, I've found the issue to be fixed by editing the entry and saving it without change.

With further investigation, I've found that in the HTML export of my database, éditeur is stored with an HTML entity in place of the é (e.g. &eacute;diteur). When using tag renaming with the entity-encoded form instead of éditeur, it works. Please note that my database is very old. I now remember that part of it was imported from Semantic Scuttle, which was the tool I used before switching to Shaarli.
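The mismatch described above can be illustrated with a small sketch. It is written in Python purely for illustration (Shaarli itself is PHP, and this is not its code); the tag list, the `&eacute;` entity form and the helper names `find_exact` and `find_decoded` are assumptions. It shows why an exact string comparison finds nothing for an entity-encoded tag, and why comparing decoded forms does.

```python
import html

# Hypothetical datastore content: one tag is stored entity-encoded,
# as it might be after an import from an older tool.
stored_tags = ["&eacute;diteur", "php"]

# What the user types in the rename form ("éditeur", with a precomposed é).
searched = "\u00e9diteur"


def find_exact(tags, needle):
    """Exact string comparison: the entity-encoded tag never matches."""
    return [t for t in tags if t == needle]


def find_decoded(tags, needle):
    """Compare after decoding HTML entities on both sides."""
    wanted = html.unescape(needle)
    return [t for t in tags if html.unescape(t) == wanted]


exact_matches = find_exact(stored_tags, searched)
decoded_matches = find_decoded(stored_tags, searched)

print(exact_matches)
print(decoded_matches)
```

This would also be consistent with editing and re-saving a bookmark fixing it, if saving rewrites the tag in its plain UTF-8 form, though that is a guess.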

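So, the issue should be split in two: HTML import doesn't convert HTML entities to UTF-8. Tag renaming doesn't support HTML entities in database.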

nodiscc commented 9 months ago

Good debugging job on this issue @landure, thank you

So, the issue should be split in two: HTML import doesn't convert HTML entities to UTF-8. Tag renaming doesn't support HTML entities in database.

I agree, want to do it? Just remember to include the example problematic bookmark from the HTML export, and link back to this issue for details.

Thanks again
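For whoever picks up the two follow-up issues, here is a minimal sketch of the idea behind both fixes, again in Python for illustration only and not taken from Shaarli's code base. The bookmark structure (a dict with a `tags` list) and the functions `normalize_tag` and `rename_tag` are hypothetical. Tags are decoded from HTML entities to UTF-8 before being compared, and the decoded form is what gets written back.

```python
import html


def normalize_tag(tag: str) -> str:
    """Decode HTML entities so '&eacute;diteur' and 'éditeur' compare equal."""
    return html.unescape(tag)


def rename_tag(bookmarks, old, new):
    """Rename `old` to `new` in every bookmark, matching on the decoded form.

    `bookmarks` is assumed to be a list of dicts with a 'tags' list.
    Returns the number of bookmarks that were changed.
    """
    old_normalized = normalize_tag(old)
    changed = 0
    for bookmark in bookmarks:
        tags = [normalize_tag(t) for t in bookmark["tags"]]
        if old_normalized in tags:
            renamed = (new if t == old_normalized else t for t in tags)
            # dict.fromkeys drops duplicates while keeping the original order.
            bookmark["tags"] = list(dict.fromkeys(renamed))
            changed += 1
    return changed


# Hypothetical data for a quick check.
bookmarks = [
    {"tags": ["&eacute;diteur", "php"]},
    {"tags": ["php"]},
]
changed_count = rename_tag(bookmarks, "\u00e9diteur", "editor")
print(changed_count, bookmarks)
```

Applying the same `normalize_tag` step while importing an HTML export would address the first issue at its source, so the datastore never contains entity-encoded tags in the first place.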