atg / chocolat-public

Public bug tracker for the private chocolat project
http://chocolatapp.com

Chocolat does not honor end of line types #1744

Open milodorn opened 8 years ago

milodorn commented 8 years ago

Hello,

Since there is no option to select the end-of-line type, this becomes a big issue once you need to edit, for example, config files. Config files with different line endings are mostly not accepted, so when you edit such a file in Chocolat and save it you end up with a broken file. It looks fine, but it is no longer accepted as a valid config file.

Could you either: A) make Chocolat honor the end-of-line type when editing files (probably easiest to implement) or B) add an option to unify mixed end-of-line types by choosing one of them when saving a file (harder to implement but more versatile)?

Thank you in advance for implementing this soon.

Best regards, Miloslav Dorňák

atg commented 8 years ago

Chocolat should be doing all of this. The way I implemented it:

  1. We detect the type of line ending when the file is opened
  2. When we save, all line endings are converted to that type.

So if it's not doing that, it's a bug.
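
The two steps described above can be sketched as follows. This is a minimal illustration in Python, not Chocolat's actual code: the function names `detect_line_ending` and `convert_line_endings` are hypothetical, and picking the most frequent terminator is an assumed policy for files with mixed endings.

```python
import re

def detect_line_ending(data: bytes) -> bytes:
    """Return the most common line terminator in data (LF if none is found)."""
    crlf = data.count(b"\r\n")
    lone_cr = data.count(b"\r") - crlf   # CRs that are not part of a CRLF
    lone_lf = data.count(b"\n") - crlf   # LFs that are not part of a CRLF
    counts = {b"\r\n": crlf, b"\n": lone_lf, b"\r": lone_cr}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else b"\n"

def convert_line_endings(data: bytes, ending: bytes) -> bytes:
    """Rewrite every CRLF, lone CR, or lone LF as the given terminator."""
    return re.sub(rb"\r\n|\r|\n", lambda match: ending, data)

# Step 1: detect on open. Step 2: convert everything to that type on save.
opened = b"first\r\nsecond\r\nthird\n"
ending = detect_line_ending(opened)
saved = convert_line_endings(opened, ending)
```

A majority vote is only one way to resolve mixed files; the thread does not say which rule Chocolat applies.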

milodorn commented 8 years ago

Well, it should, but something is wrong. Attaching a file where you could test it. When edited in Chocolat and used it was rejected. When edited in Textmate and tested it was ok.

smb.conf.txt

atg commented 8 years ago

I did a hexdump and I only see 0A in there (LF), no 0D (carriage return).

Also, for me, there is no change:

$ md5 before.txt 
MD5 (before.txt) = 6bf90a49cbcd2a3f0427d5c899f5bdcc
$ md5 after.txt 
MD5 (after.txt) = 6bf90a49cbcd2a3f0427d5c899f5bdcc
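
Both checks above (looking for 0D bytes and comparing checksums) can also be reproduced in a few lines of Python. This is only a sketch: the helper names `terminator_bytes` and `md5_hex` are made up for illustration, and the byte strings stand in for the contents of `before.txt` and `after.txt`.

```python
import hashlib

def terminator_bytes(data: bytes) -> dict:
    """Count raw carriage-return (0x0D) and line-feed (0x0A) bytes."""
    return {"CR (0D)": data.count(b"\r"), "LF (0A)": data.count(b"\n")}

def md5_hex(data: bytes) -> str:
    """Hex MD5 digest of the data, comparable to the output of `md5`."""
    return hashlib.md5(data).hexdigest()

# Stand-ins for the two files; in practice read them with open(path, "rb").
before = b"[global]\nworkgroup = EXAMPLE\n"
after = b"[global]\nworkgroup = EXAMPLE\n"

print(terminator_bytes(before))
print(md5_hex(before) == md5_hex(after))
```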

atg commented 8 years ago

I see there is a line

server string = 

Notice the trailing space. Do you have whitespace trimming on? That could contribute to a difference between the files.
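
To see which lines an editor's whitespace trimming would alter, something like the following could be used. It is a rough sketch; `lines_with_trailing_whitespace` is a hypothetical helper, and only spaces and tabs are treated as trailing whitespace.

```python
from typing import List

def lines_with_trailing_whitespace(text: str) -> List[int]:
    """Return 1-based numbers of lines that end in a space or tab."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line != line.rstrip(" \t")
    ]

# The "server string = " line keeps its trailing space here.
sample = "[global]\nserver string = \nworkgroup = EXAMPLE\n"
print(lines_with_trailing_whitespace(sample))
```

Any line reported here would change on save if trimming were enabled, which alone is enough to change the file's checksum.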

milodorn commented 8 years ago

I am sorry, I have edited the file with Textmate after it was not accepted when edited by Chocolat so maybe it fixed it.

I am not able to isolate the issue on a smaller scale, but see the attached srt files. The original is an untouched file and is a valid srt file. The textmate-edited and chocolat-edited files are the result after I deleted the last srt entry and saved. The screenshot shows what happens when I try to add the chocolat-edited file: it is no longer recognized as a valid srt file. I have double-checked that the issue is reproducible, and I have also checked the MD5 checksums, which differ. I do not have trimming enabled, as you can see in the chocolat-prefs screenshot.

chocolat-edited chocolat-edited.srt.txt original.srt.txt textmate-edited.srt.txt chocolat-prefs

atg commented 8 years ago

aha

$ file original.srt.txt 
original.srt.txt: UTF-8 Unicode (with BOM) English text, with CRLF line terminators

Chocolat is stripping out the UTF-8 BOM, because UTF-8 files shouldn't have BOMs. Clearly whatever is reading the file requires a UTF-8 BOM, and that is the conflict here.
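
For reference, the UTF-8 BOM is the three bytes EF BB BF at the very start of a file. A small Python sketch of detecting and stripping it is shown below; the helper names are hypothetical and the sample bytes are an invented srt-like fragment, not the attached file.

```python
import codecs

def has_utf8_bom(data: bytes) -> bool:
    """True if the data starts with the UTF-8 byte order mark (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)

def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM, if there is one."""
    return data[len(codecs.BOM_UTF8):] if has_utf8_bom(data) else data

sample = codecs.BOM_UTF8 + b"1\r\n00:00:01,000 --> 00:00:02,000\r\nHello\r\n"
stripped = strip_utf8_bom(sample)
print(has_utf8_bom(sample), has_utf8_bom(stripped))
```

The stripped file differs from the original by exactly those three leading bytes, so the text looks identical in an editor while the checksum changes.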

milodorn commented 8 years ago

You are right. Thank you for cracking this. :-) I have tried all end-of-line combinations, all without a UTF-8 BOM, and none was accepted. I understand that by the spec a UTF-8 BOM is neither required nor recommended. However, it is not forbidden, and as it is clearly required in some rare cases, would it be possible to keep it if it is already there? Or at least give the user a choice to keep it when saving? And of course the gold standard would be an option to add/remove it while Saving as...? I know that this is now a feature request, but it is still relevant and could be seen as a "bug" by users, because unfortunately at the moment Chocolat can "corrupt" files even though it acts in good faith. Do you think this feature could be added to Chocolat anytime soon? I kinda love using Chocolat, but I still have to keep Textmate around for rare cases like this one. Thank you.

atg commented 8 years ago

Yeah, it's only 3 bytes but it causes big problems. Even if Chocolat generates the BOM, other Unix tools may not understand it. Since the Mac is a Unix system, my view is to not generate the BOM, to ensure maximum compatibility.

I guess we could add an advanced BOM-generating mode for times like this. But really the best thing to do would be to contact the author of whatever is rejecting your configuration file, and insist that they support standards-conforming UTF-8.
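
One way such a mode could behave is to remember whether the file had a BOM when it was opened and write it back on save. The sketch below is purely illustrative and is not a description of Chocolat: `Document`, `load`, `save`, and the `keep_bom` flag are all hypothetical, and line-ending detection is simplified to "CRLF if any CRLF is present, otherwise LF".

```python
import codecs
import re
from typing import NamedTuple

class Document(NamedTuple):
    text: str      # decoded text with line endings normalized to "\n"
    had_bom: bool  # whether the file started with a UTF-8 BOM
    ending: str    # line terminator to restore on save

def load(data: bytes) -> Document:
    had_bom = data.startswith(codecs.BOM_UTF8)
    text = data.decode("utf-8-sig")  # drops a leading BOM if present
    ending = "\r\n" if "\r\n" in text else "\n"  # simplified detection
    return Document(re.sub(r"\r\n|\r|\n", "\n", text), had_bom, ending)

def save(doc: Document, keep_bom: bool = True) -> bytes:
    body = doc.text.replace("\n", doc.ending).encode("utf-8")
    # Only re-emit the BOM if the file had one and the user wants it kept.
    return codecs.BOM_UTF8 + body if (doc.had_bom and keep_bom) else body

original = codecs.BOM_UTF8 + b"1\r\nHello\r\n"
assert save(load(original)) == original
```

With `keep_bom=False` the same round trip yields the BOM-less output that the thread describes as the current behavior.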