TheInfiniteKind / moneydance_open


Add Byte Order Mark (BOM_UTF8) to Reports>Save>CSV / Text files so that Excel double-click to open works OK #80

Closed yogi1967 closed 2 years ago

yogi1967 commented 2 years ago

NOTE: Something changed for Windows users between 4071 and 4072, so they are now noticing this issue.

When you use MD Reports and click Save to create a text or CSV file, Excel has problems opening the file via a double-click. On a Mac, MD writes the text/CSV file as Unicode (great), but by default Excel opens files as ‘Windows’ (which is probably Latin or similar). This means that ‘£’ signs come out as ‘¬£’ and tick marks as ‘‚úì’. BUT, if you open a new sheet, choose File/Import, and change the file type to Unicode UTF-8, then it imports fine. See screenshots.
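The garbling described above can be reproduced outside Excel. This is a minimal sketch, assuming the file is decoded with a legacy single-byte charset; `mac_roman` is my guess at that charset, chosen only because it reproduces the exact strings in the report, and it is not confirmed as what Excel actually uses.

```python
# Simulate an application reading UTF-8 bytes with a legacy single-byte charset.
# Assumption: "mac_roman" stands in for whatever default Excel applied on the Mac.
original = "£ ✓"  # pound sign, space, check mark

# This is what a UTF-8 file without a BOM contains.
utf8_bytes = original.encode("utf-8")

# Wrong charset: every UTF-8 byte is shown as its own character.
garbled = utf8_bytes.decode("mac_roman")

# Right charset: the text round-trips unchanged.
recovered = utf8_bytes.decode("utf-8")

print(garbled)    # ¬£ ‚úì
print(recovered)  # £ ✓
```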

[Screenshots: 2022-04-22 at 08 19 25, 2022-04-22 at 08 19 13, 2022-04-22 at 08 14 48]

I had this in the early days with my export_data extension. I fixed it by adding a Byte Order Mark (BOM) to the very beginning of the file. Refer: https://en.wikipedia.org/wiki/Byte_order_mark

In Python this is codecs.BOM_UTF8 (b'\xef\xbb\xbf'). Alternatively, write the Unicode character \ufeff encoded as UTF-8, or write the three bytes 0xEF, 0xBB, 0xBF directly at the beginning of the UTF-8 file.
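A quick sketch showing that the three ways of producing the BOM mentioned above yield the same bytes, and that Python's standard `utf-8-sig` codec adds the BOM for you. The sample string is made up for illustration.

```python
import codecs

# Three equivalent ways to obtain the UTF-8 BOM.
bom_constant = codecs.BOM_UTF8
bom_from_char = "\ufeff".encode("utf-8")
bom_from_bytes = bytes([0xEF, 0xBB, 0xBF])

all_equal = bom_constant == bom_from_char == bom_from_bytes

# The "utf-8-sig" codec prepends the BOM when encoding
# and strips it again when decoding.
sample = "Total,£100"  # illustrative data, not from MD
encoded = sample.encode("utf-8-sig")
decoded = encoded.decode("utf-8-sig")
```

Decoding the same bytes with plain `utf-8` would leave `\ufeff` as the first character of the string, which is why readers that expect a BOM should use `utf-8-sig`.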

My extension(s) have a parameter so the user can turn this on / off as needed.
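A user-selectable BOM could look roughly like the sketch below. The function name `save_csv`, the `add_bom` parameter, and the sample rows are hypothetical; this is not the code from the extension.

```python
import csv
import os
import tempfile


def save_csv(path, rows, add_bom=True):
    """Write rows as UTF-8 CSV, optionally prefixed with a BOM (hypothetical helper)."""
    # "utf-8-sig" writes the BOM once at the start of the file; "utf-8" does not.
    encoding = "utf-8-sig" if add_bom else "utf-8"
    with open(path, "w", newline="", encoding=encoding) as handle:
        csv.writer(handle).writerows(rows)


def first_bytes(path, count=3):
    """Return the first few raw bytes of a file so the BOM can be checked."""
    with open(path, "rb") as handle:
        return handle.read(count)


rows = [["Account", "Balance"], ["Savings", "£100.00"]]  # illustrative data
tmp_dir = tempfile.mkdtemp()
with_bom = os.path.join(tmp_dir, "with_bom.csv")
without_bom = os.path.join(tmp_dir, "without_bom.csv")

save_csv(with_bom, rows, add_bom=True)
save_csv(without_bom, rows, add_bom=False)

print(first_bytes(with_bom))     # b'\xef\xbb\xbf'
print(first_bytes(without_bom))  # b'Acc'
```

Keeping the switch is useful because some consumers of CSV files treat the BOM as part of the first field rather than as an encoding hint.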

In MD, this is at: Report->Save->Comma Delimited or Text File
com.moneydance.apps.md.view.gui.reporttool.SaveReportWindow with ExportFormat.COMMADEL
com.moneydance.apps.md.view.gui.reporttool.ReportDelimitedExporter.export(Report, OutputStream)

Something has changed on Windows between 4071 and 4072. On Windows, MD always used to create “ISO-8859 text, with CRLF line terminators” files, and on 4072 it creates “Unicode text, UTF-8 text, with CRLF line terminators” files. On Macs, MD has always created “Unicode text, UTF-8 text” files. What changed between 4071 and 4072? Java? Was the file encoding changed?
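To illustrate why that change would suddenly become visible to Windows users, here is a small sketch comparing the two encodings for a single pound sign. It assumes the reader (Excel) decodes with a Windows single-byte code page; `cp1252` is used here as a stand-in and is an assumption, not something confirmed in this report.

```python
text = "£"

# ISO-8859-1 stores the pound sign as one byte, which cp1252 reads identically.
latin1_bytes = text.encode("iso-8859-1")   # b'\xa3'
latin1_as_seen = latin1_bytes.decode("cp1252")  # '£'

# UTF-8 stores it as two bytes; a cp1252 reader shows both as characters.
utf8_bytes = text.encode("utf-8")          # b'\xc2\xa3'
utf8_as_seen = utf8_bytes.decode("cp1252")  # 'Â£'

print(latin1_bytes, latin1_as_seen)
print(utf8_bytes, utf8_as_seen)
```

Under that assumption, the old ISO-8859 files happened to match what the reader expected, while the new UTF-8 files need a BOM (or an explicit import step) to be decoded correctly.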

Anyway, I think it’s fine to always create UTF-8 files, but they need the BOM added.

So, my question is: can MD insert the BOM at the beginning of CSV/text files it creates when calling save report?

yogi1967 commented 2 years ago

https://inside.java/2021/10/04/the-default-charset-jep400/

yogi1967 commented 2 years ago

Fixed in 4073 - thanks!