albfan / jmeld

A visual diff and merge tool

Encoding problems #66

Closed mrkaban closed 2 years ago

mrkaban commented 2 years ago

Even if you set utf-8, it still displays the Russian text incorrectly. 111

albfan commented 2 years ago

Is it possible to add a URL to a repo where this problem can be shown?

Or two files to compare? I guess the encoding is something different.

mrkaban commented 2 years ago

I attach the file that was on the left. main.zip I also attach a link to the file that was on the right.

albfan commented 2 years ago

Looks like I opened the diff with the files the opposite way round, but it's rendering correctly (I don't expect jmeld to modify sources in any way).

Screenshot from 2021-12-01 00-22-42

We can reopen if you provide any further info about what causes this, but it looks like something in your environment.

mrkaban commented 2 years ago

Here are the same files in Meld Meld

mrkaban commented 2 years ago

I installed kdiff3 and there is no such problem there. The conclusion is simple: the problem is only in jmeld. 111

albfan commented 2 years ago

The conclusion is there's a problem in your java environment. kdiff3 is a C program as far as I know.

Did you see my screenshot?

mrkaban commented 2 years ago

Does it bother you that the screenshots immediately show that I checked on different computers? In other programs using the Java environment, similar problems are not observed. Are you saying that this is a problem with Java, not with the program? Despite the fact that such a problem on different operating systems. Here are the screenshots. 111 222 333

mrkaban commented 2 years ago

Yes, yes, it's all a Java environment that I just installed and everything works out of the box without additional settings.

There is a term, "made for people": it means that instead of telling you that you are doing something wrong, they ask you to send the logs and, after analyzing them, tell you what the error is.

albfan commented 2 years ago

I suggest attaching both files you're comparing.

git checkout can do nasty things to encoding. I'm unsure whether meld or kdiff3 have some special encoding parsers (I doubt it), so I guess your Java environment is getting in the way.

But I don't know how.

mrkaban commented 2 years ago

It often happens that complex things are done very simply - the main thing is to start and do it in small steps.

I am attaching both files. main1.zip main2.zip

albfan commented 2 years ago

still working out of the box.

I was able to reproduce using -Dfile.encoding=cp1251 so I suppose you're a Windows user complaining to an open source maintainer about a shitty OS decision you made.

Funnily enough:

$ rg 1251
main.pas
247:     S := UTF8ToCP1251(S + CHR(13));
255:       S := UTF8ToCP1251(S + CHR(13));
274:     S := UTF8ToCP1251(S + CHR(13));
282:       S := UTF8ToCP1251(S + CHR(13));
357:t.add(SysToUTF8(s));     //   UTF8ToCP1251
406:t.add(SysToUTF8(s));     //   UTF8ToCP1251
816:          ListBox1.Items.Add(CP1251ToUTF8(NameKey));
890:           MyList2.Add(CP1251ToUTF8(RedString('DisplayName')));
909:           MyList2.Add(CP1251ToUTF8(ReadString('DisplayName')));
1662:           ListBox1.Items.Add(CP1251ToUTF8(NameKey));
...

UTF8ToCP1251 and CP1251ToUTF8 are a hint that you're doing something wrong and asking the world to fit your environment.

You probably have JAVA_TOOL_OPTIONS defined globally. You can get the latest release at https://github.com/albfan/jmeld/releases/tag/3.6.0 and check for yourself:

JAVA_TOOL_OPTIONS=-Dfile.encoding=cp1251 java -jar jmeld-3.6.0-jar-with-dependencies.jar main1.pas main2.pas

BONUS: I was missing a way to check system properties, so I created https://github.com/albfan/javaProperties to check them.
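For readers who want to see what such a check can look like, here is a minimal sketch. It is not the code of the javaProperties project linked above; the class name `EncodingReport` and the choice of keys are assumptions made for illustration. `sun.jnu.encoding` is an implementation-specific property of OpenJDK-style runtimes and may be unset elsewhere.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of jmeld or javaProperties.
final class EncodingReport {

    // Returns the value, or a marker string when the setting is absent.
    private static String orUnset(String value) {
        return value == null ? "(unset)" : value;
    }

    /** Collects the settings that usually decide how text files are decoded. */
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();
        report.put("file.encoding", orUnset(System.getProperty("file.encoding")));
        report.put("sun.jnu.encoding", orUnset(System.getProperty("sun.jnu.encoding")));
        report.put("defaultCharset", Charset.defaultCharset().name());
        report.put("JAVA_TOOL_OPTIONS", orUnset(System.getenv("JAVA_TOOL_OPTIONS")));
        report.put("LANG", orUnset(System.getenv("LANG")));
        return report;
    }

    public static void main(String[] args) {
        // Print one "key = value" line per setting.
        collect().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

If `defaultCharset` reports windows-1251 while the files being compared are UTF-8, that would be consistent with the garbled Cyrillic described in this issue.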

mrkaban commented 2 years ago

1) No need for insults. I didn't insult you or your code - it was just that there was an error in the program, and you insult me personally.

2) You are wrong about the situation, because you have not figured it out. I am the owner of several large portals about free software (mostly in Russian, but there are also in English). We are talking about free software, which is distributed under licenses approved by the Free Software Foundation. And on some of them I lead the category of free software for Windows (yes, it's popular - free software on a non-free OS). And I added your program to the site, by the way, a good description is written (in the sense that I didn't write anything bad about it, only positively).

3) The submitted files relate to a program that I abandoned about five years ago. The encoding conversion example you quoted is written in FreePascal. You probably haven't worked with it, since you don't understand the point? UTF-8 doesn't work out of the box in it, so you need to convert variables containing Russian letters to UTF-8, but some Lazarus components and libraries don't work properly with this encoding. I love free software, but this compiler is too problematic (in my personal opinion).

4) I have not studied Java, and accordingly, I have not changed any parameters globally. I installed Java and JDK itself, since some of the free programs (gnu gpl) didn't work without it. I haven't changed anything in the Java environment settings and I can't imagine how this is done.

5) It doesn't matter if the program is paid or free, the approach is important. You may not have time to support it, but you can use the "made for people" approach. What does it mean? It means that the program can be updated extremely rarely, but when creating it you are guided by what the users of your program want to see, not by how you see it. If you write a program on the assumption that the user will figure it out himself, then there will be no popularity. Even experts don't want to bother now: they open the program, something is wrong, they close it and go looking for the next one, which won't have this problem out of the box. And it will not matter to them whether the problem is on their side or in the program, since with the right approach the program should provide for possible user errors. By the way, the program with the bad code you pointed out is searched for with a frequency of approximately 300 sets per month.

6) how to switch JAVA_TOOL_OPTIONS?

albfan commented 2 years ago

You act entitled all the time. I provide support even when the problem was obvious (hoping you would see how wrongly you are behaving). An issue is always a chance to learn how to contribute.

You still expect me to teach you how Java and the cp1251 Windows encoding work. Go and open issues on Java and Windows and see how much attention you get.

I suggest running the program I created to check your final Java config.

If you have absolutely no idea how Java works, that's fine, but think twice before saying "this is not working for me, so it's your fault".

java -Dfile.encoding=utf-8 -jar jmeld-3.6.0-jar-with-dependencies.jar main1.pas main2.pas

This should fix your problem. If you run the test program I suggested, you can check whether your file encoding is set to cp1251.

As for how that happens, maybe it is the default for Java on Windows, or maybe you have JAVA_TOOL_OPTIONS set in your environment variables.
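To make the failure mode concrete, here is a small sketch of what happens when UTF-8 bytes are decoded as cp1251 (canonical Java name `windows-1251`). It is an illustration only, not jmeld code; the class name `MojibakeDemo` and the sample word are assumptions. The sample text is written with Unicode escapes so that the source file itself does not depend on any encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of jmeld.
final class MojibakeDemo {

    /** Encodes text as UTF-8, then decodes those bytes with the given charset. */
    static String decodeUtf8BytesAs(String text, Charset decodingCharset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, decodingCharset);
    }

    public static void main(String[] args) {
        Charset cp1251 = Charset.forName("windows-1251");
        // The Russian word for "hello", spelled with escapes.
        String original = "\u041f\u0440\u0438\u0432\u0435\u0442";

        // Decoding with the matching charset returns the original text.
        System.out.println(decodeUtf8BytesAs(original, StandardCharsets.UTF_8));
        // Decoding with cp1251 turns every two-byte UTF-8 sequence into two
        // unrelated characters, so the word doubles in length and is unreadable.
        System.out.println(decodeUtf8BytesAs(original, cp1251));
    }
}
```

Each Cyrillic letter takes two bytes in UTF-8, and cp1251 maps each of those bytes to its own character, which is why the garbled text is twice as long as the original.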

mrkaban commented 2 years ago

You're an ordinary windbag who can't do anything but talk. Does not work, checked on Linux Mint 20.2.

Am I behaving incorrectly? Who are you to teach me anyway? I didn't teach you, and with a simple movement I made a catalog with an attendance of ~9,000 people a day, and a number of paid programs, and you write programs that don't even get on softpedia. And I know more languages than you, except only java. But if a girl like you, you want to assert yourself so much, then please, only when you receive 100 euros for work, do not ask why so little.

I'm deleting your program, since there is something to replace it with. You are blacklisted and the international anti-spam list.

albfan commented 2 years ago

You still get it wrong.

I give support on these issues because that could help someone in the same situation as you.

I don't care whether you like this software or not, or whether you use it, list it, or whatever. It is free software and you can do whatever you want with it.

You have trouble understanding that setting up Java correctly is your duty, not mine.

I don't see "a girl like you" as an insult; try to think about what that says about you.

This is the proof that this is the solution:

java -Dfile.encoding=cp1251 -jar target/jmeld-3.6.0-jar-with-dependencies.jar main1.pas main2.pas

Screenshot from 2021-12-04 07-47-04

java -Dfile.encoding=cp1251 -jar target/jmeld-3.6.0-jar-with-dependencies.jar main1.pas main2.pas

Screenshot from 2021-12-04 07-48-07

now, let's check the correct place you should go to find why it does not work for you:

This is not a bug. The "file.encoding" property is not required by the J2SE platform specification; it's an internal detail of Sun's implementations and should not be examined or modified by user code. It's also intended to be read-only;

So this might or might not work for you.

My locale is:

$ echo $LANG
es_ES.UTF-8

I don't have Russian locales with cp1251:

$ locale -a | grep ru
ru_RU
ru_RU.iso88595
ru_RU.koi8r
ru_RU.utf8
russian
ru_UA
ru_UA.koi8u
ru_UA.utf8

But two locales contain cp1251:

$ locale -a | grep cp1251
be_BY.cp1251
bg_BG.cp1251

I created a single-line file, to avoid scrolling down to line 1727 to check it:

text-check.txt

LANG=be_BY.cp1251 java -jar jmeld-3.6.0-jar-with-dependencies.jar text-check text-check

Screenshot from 2021-12-04 08-12-42

LANG=be_BY.utf8 java -jar jmeld-3.6.0-jar-with-dependencies.jar text-check text-check

Screenshot from 2021-12-04 08-13-52

I guess this can be bundled with:

jmeld

#!/bin/bash
# Force a UTF-8 locale for the JVM and forward all arguments unchanged.
LANG=C.utf8 java -jar jmeld-3.6.0-jar-with-dependencies.jar "$@"

Just out of curiosity:

LANG=be_BY.cp1251 meld text-check.txt text-check.txt

Screenshot from 2021-12-04 08-19-52

meld just ignores your setup and defaults to utf8 (see the bottom status bars of each textArea)
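A "default to UTF-8 regardless of the environment" policy can be sketched in Java roughly as follows. This is an assumption about one possible approach, not how meld or jmeld actually read files; the class name `Utf8FirstDecoder` and the fallback parameter are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of meld or jmeld.
final class Utf8FirstDecoder {

    /** Decodes as strict UTF-8; if the bytes are not valid UTF-8, uses the fallback charset. */
    static String decode(byte[] bytes, Charset fallback) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    // Fail on invalid input instead of silently inserting U+FFFD.
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            // Not valid UTF-8: assume a legacy single-byte encoding.
            return new String(bytes, fallback);
        }
    }

    public static void main(String[] args) {
        Charset cp1251 = Charset.forName("windows-1251");
        String text = "\u041f\u0440\u0438\u0432\u0435\u0442";
        // Both calls print the same word, whichever encoding produced the bytes.
        System.out.println(decode(text.getBytes(StandardCharsets.UTF_8), cp1251));
        System.out.println(decode(text.getBytes(cp1251), cp1251));
    }
}
```

The design choice here is that the result no longer depends on `file.encoding` or `LANG`: valid UTF-8 always wins, and the legacy charset is used only when strict decoding fails.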

Hope this helps someone other than @mrkaban, who sadly is not using jmeld anymore. :wink: