salesagility / SuiteCRM-Core

SuiteCRM - Open source CRM for the world
https://www.suitecrm.com
GNU Affero General Public License v3.0

Views Incorrectly Encode UTF Characters as HTML Entities #274

Open vladaman opened 1 year ago

vladaman commented 1 year ago

Issue

Views currently render non-ASCII (UTF-8) characters as HTML entities instead of the characters themselves.

Actual Behavior

In the Accounts, Notes and Contacts views, links are shown with HTML entities in place of the original characters.

Screenshot from 2023-06-02 11-28-19

Possible Fix

Decode HTML entities into UTF-8. The behavior is correct on SuiteCRM 8.3.0 at https://suite8demo.suiteondemand.com/ in the English version.
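To illustrate what decoding would mean here, a minimal sketch in Python (not SuiteCRM code). The entity string is an assumed example of how the name from the reproduction steps could look once every character has been turned into a numeric HTML entity; the real payload may differ.

```python
import html

# Assumed example: "šččřž" with each character written as a numeric HTML entity.
encoded = "&#353;&#269;&#269;&#345;&#382;"

# html.unescape turns named and numeric entities back into Unicode characters.
decoded = html.unescape(encoded)
print(decoded)
```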

Steps to Reproduce

  1. Change SuiteCRM to a non-English language (issue may be in
  2. Create an Account with non-English characters like "šččřž"
  3. Display the list of Accounts

Your Environment

johnM2401 commented 1 year ago

Hey @vladaman

I've tested on both the demo and locally using the characters you provided. Only the first one seems to be encoded incorrectly: image

(I'll mark this as a bug, as some characters do seem to be encoded incorrectly.)

However, most other characters I've tried DO seem to be encoded correctly.

Do you have any further examples of characters that you found encode poorly?


As the issue seems much more prevalent on your local environment, it might be worth checking a few things:

Thanks!

lukio commented 1 year ago

Hi @johnM2401 @vladaman, I can confirm this issue with the description field. In the detail/edit views it renders as expected, but in the list view the description field renders with the wrong encoding.

DetailView

I also captured the GraphQL response: in the detail view (when getRecord is called), the description field comes back with the characters as expected.

Captura de pantalla de 2023-06-08 14-06-09

ListView

The GraphQL response (getRecordList) returns the description field with the wrong characters, as you can see.

Captura de pantalla de 2023-06-08 14-06-58

I verified that the output of php -i and php -m is OK.

lukio commented 1 year ago

I also verified the issue with the parent_name field at https://suite8demo.suiteondemand.com/

Captura de pantalla de 2023-06-08 14-35-32

pgorod commented 1 year ago

What about the DB? Is it wrong in the DB?

There should be no HTML escaping in the database. If we do this wrong (which we do in v7!) all over the place, then we need to revert the damage in hundreds of places in the UI, one by one.

Then, when we fix the central problem, we will need to fix back all the previous fixes, bug by bug, one by one. We should really get a grip on this "security clean-ups" issue, otherwise we'll just keep adding to the technical debt...
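The principle of keeping the database free of HTML escaping and escaping only at output can be sketched as follows. This is an illustration in Python, not SuiteCRM code: the dict standing in for the database, the `store` and `render_link` helpers, and the link path are all hypothetical.

```python
import html

def store(db: dict, record_id: str, name: str) -> None:
    # Persist the text exactly as entered: no HTML escaping at write time.
    db[record_id] = name

def render_link(db: dict, record_id: str) -> str:
    # Escape once, at the point where the value is placed into HTML.
    # Only &, <, > and quotes are escaped; non-ASCII characters stay as they are.
    name = html.escape(db[record_id])
    href = html.escape(f"#/accounts/record/{record_id}")  # hypothetical route
    return f'<a href="{href}">{name}</a>'

db = {}
store(db, "1", "šččřž & Sons")
print(db["1"])               # raw text, as stored
print(render_link(db, "1"))  # escaped only for HTML output
```

With this split, a consumer that does not produce HTML (a JSON API, an export) can use the stored value directly without undoing any escaping first.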

lukio commented 1 year ago

Hi! I believe that kind of change is too much. For now, we would re-encode those fields when retrieving data, as getRecord should be doing.
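A rough sketch of that interim approach, written in Python for illustration only. The record shape, the `decode_fields` helper and the choice of fields are assumptions, not the actual getRecordList handler.

```python
import html
from typing import Any

def decode_fields(records: list[dict[str, Any]], fields: set[str]) -> list[dict[str, Any]]:
    """Return copies of the records with HTML entities decoded in the given fields."""
    decoded = []
    for record in records:
        copy = dict(record)  # leave the input records untouched
        for name in fields:
            value = copy.get(name)
            if isinstance(value, str):
                copy[name] = html.unescape(value)
        decoded.append(copy)
    return decoded

# Assumed example rows, shaped like a list response with entity-encoded text.
rows = [{"id": "1", "description": "&#353;&#269;", "parent_name": "&#352;koda"}]
clean = decode_fields(rows, {"description", "parent_name"})
print(clean)
```

Decoding at retrieval hides the symptom per field; it does not address where the entities are introduced.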

pgorod commented 8 months ago

I'm trying to reproduce the issue locally but I can't. I'm on 8.4.1, maybe that's why.

Is it really necessary to change the SuiteCRM language to something non-English? That's a lot of work which I am trying to avoid. Did anyone reproduce this in a plain English version? How, exactly?

Thanks

yunusyerli1 commented 2 months ago

Hi @johnM2401 @vladaman, I have tested this issue with new characters such as "Şebnem Çakğır šččřž 안녕하세요감사합니다ΚαλημέραΕυχαριστώสวัสดีनमस्ते" and found that the issue exists only in list views. In the network tab, parent_name is encoded incorrectly. I have checked the DB, and there it is encoded correctly.

char_encoding1

char_encoding_db

chris001 commented 2 months ago

@yunusyerli1 Nice work. Can you double check, when you look at the ListView's HTML page source code (right click on list view, click View Page Source), is the field double-encoded, or single-encoded?
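One way to tell single from double encoding, once a value has been copied out of the page source or the network tab, is to count how many decoding passes it takes before the text stops changing. The sketch below is a heuristic in Python; `encoding_depth` is a hypothetical helper, and text that legitimately contains something like a literal "&amp;" would be counted as encoded.

```python
import html

def encoding_depth(value: str, max_depth: int = 5) -> int:
    """Count the html.unescape passes needed until the value stops changing."""
    depth = 0
    while depth < max_depth:
        decoded = html.unescape(value)
        if decoded == value:
            break
        value = decoded
        depth += 1
    return depth

print(encoding_depth("šččřž"))       # not encoded
print(encoding_depth("&#353;"))      # encoded once
print(encoding_depth("&amp;#353;"))  # encoded twice
```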