ArjenLammers / csv-connector

Mendix App Store CSV Connector
MIT License

Export Action should accept encoding specifications. #5

Closed youtasuzuki closed 2 years ago

youtasuzuki commented 2 years ago

I think the Export actions should accept an encoding specification. For now, they can only write files in the default encoding. That is a real problem here in Asia, because several encodings are in common use, and they differ from country to country.

My idea is to add a charsetName (encodingName) string parameter to the Export action and pass it to the OutputStreamWriter that wraps the FileOutputStream.

As Is: 'CSVWriter writer = new CSVWriter(new FileWriter(tmpFile), '

To Be: 'CSVWriter writer = new CSVWriter(new BufferedWriter(new OutputStreamWriter(new FileOutputStream(tmpFile), charsetName)), '

For compatibility, it would be a good idea to add this as a new action, for example ExportCSVwithCharset.
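To make the proposal concrete, here is a minimal sketch of the writer chain described above, using only the JDK. It is not the connector's code: the class name `CharsetCsvSketch` and the method `writeLines` are hypothetical, and the sketch writes pre-formatted lines to any `OutputStream` instead of going through `CSVWriter`.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.List;

// Hypothetical helper, not part of the csv-connector module.
final class CharsetCsvSketch {

    // Writes each line followed by CRLF, encoded with the named charset.
    // For a file, pass new FileOutputStream(tmpFile) as the stream.
    static void writeLines(OutputStream out, List<String> lines, String charsetName) {
        // OutputStreamWriter throws UnsupportedEncodingException (an IOException)
        // when the JVM does not know the charset name.
        try (Writer writer = new BufferedWriter(new OutputStreamWriter(out, charsetName))) {
            for (String line : lines) {
                writer.write(line);
                writer.write("\r\n");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `writeLines(stream, lines, "Shift_JIS")` would produce Shift JIS bytes on a JVM that ships that charset, whereas `FileWriter(tmpFile)` without a charset argument always uses the platform default.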

ArjenLammers commented 2 years ago

Hi youtasuzuki, Thanks for this feature request!

Could you perhaps provide a project (File -> Export App Package) and a database (Version Control -> Add snapshot of data) and the data-snapshot.zip from the project's directory? It would really help me because I'm not that familiar with the other character sets and the expected outcomes and want to embed it into the test project.

Kind regards, Arjen Lammers

youtasuzuki commented 2 years ago

Hi Arjen-san, The following page explains the complicated encoding problem in Japan well. https://www.dampfkraft.com/mojibake-field-guide.html

In the current implementation, the CSV module generates a UTF-8 CSV file. However, MS-Excel, for example, expects the CSV file to be in Shift JIS encoding, so opening the file in MS-Excel produces mojibake. For that use case, the CSV module needs to generate the CSV file in Shift JIS encoding.

This problem does not occur when the CSV module is used on its own, only when its output is consumed by another system. We have to generate the CSV file in the encoding that the consuming system expects. The file generated by the CSV module is not corrupted in itself, but if its encoding is not what the reader expects, the text is garbled when it is read.
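The mismatch between writer and reader can be reproduced with a few lines of JDK code. This is only an illustrative sketch; `MojibakeSketch` and `reinterpret` are hypothetical names, not part of the CSV module.

```java
import java.nio.charset.Charset;

// Hypothetical demo class, not part of the csv-connector module.
final class MojibakeSketch {

    // Encodes text the way the writer would, then decodes the same bytes
    // the way the reader would. The bytes are never altered in between.
    static String reinterpret(String text, String writerCharset, String readerCharset) {
        byte[] bytes = text.getBytes(Charset.forName(writerCharset));
        return new String(bytes, Charset.forName(readerCharset));
    }
}
```

With matching charsets, for example `reinterpret(text, "UTF-8", "UTF-8")`, Japanese text comes back unchanged. With `reinterpret(text, "UTF-8", "Shift_JIS")` the same bytes decode to different characters, which is the situation described above.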

ArjenLammers commented 2 years ago

Hi youtasuzuki,

I've published version 1.9 which should be able to handle this case. It should work when you supply SJIS as character set name.
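For readers wondering how a character set name such as the one above might be turned into a Java charset, here is a hedged sketch. It does not show how version 1.9 does it; `CharsetNameSketch` and `resolve` are hypothetical, and falling back to the platform default for an empty name is an assumption made for illustration.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not the connector's implementation.
final class CharsetNameSketch {

    // Resolves a user-supplied name; an empty name falls back to the
    // platform default (an assumption for this sketch).
    static Charset resolve(String charsetName) {
        if (charsetName == null || charsetName.trim().isEmpty()) {
            return Charset.defaultCharset();
        }
        // Throws an IllegalArgumentException subclass when the name is
        // syntactically illegal or unsupported on this JVM.
        return Charset.forName(charsetName.trim());
    }
}
```

Which names are accepted depends on the charsets installed in the JVM, so `Charset.isSupported(name)` is a cheap way to check a name up front.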

Kind regards, Arjen Lammers

youtasuzuki commented 2 years ago

Hi Arjen-san,

Thank you for your support. However, characterSet is never used in ExportOQLToCSV and ExportSQLToCSV. Could you please check that?

Best regards, Yuta Suzuki.

ArjenLammers commented 2 years ago

Hi youtasuzuki,

I fixed it in 1.9.1.

Kind regards, Arjen Lammers

youtasuzuki commented 2 years ago

Hi Arjen-san,

Thank you for your kind support. Now we can use it in an environment with multiple character sets!

Best regards, Yuta Suzuki.