Closed carribeiro closed 2 years ago
Updated the title. It seems to me that cq-editor is not saving files as UTF-8 (which I believe it should be doing).
Trying to figure out what's going on. The save command is implemented on editor.py (https://github.com/CadQuery/CQ-editor/blob/master/cq_editor/widgets/editor.py), line 162:
    with open(self._filename,'w') as f:
        f.write(self.toPlainText())
Seems that self.toPlainText() doesn't generate the correct Unicode representation for file save. The code comes from Spyder, and I'm still trying to figure out which would be the correct way of saving files using Spyder's API (the code base is a lot bigger and I'm still trying to understand how the editor is structured).
I cannot reproduce this on linux. Could you report what you get in the console when executing this:
self.components['editor'].toPlainText()
@carribeiro what does import locale; locale.getpreferredencoding() produce? This should be the encoding used by open.
I cannot reproduce it either on OSX. File is saved as UTF-8:
(cq) MacBook-Pro:cadquery cribeiro$ file *
test3.py: UTF-8 Unicode text
The test suggested above on OSX returns the following:
In[1]: self.components['editor'].toPlainText()
Out[1]: 's = "atenção"\n'
The test also returns a similar result on Windows:
In[5]: self.components['editor'].toPlainText()
Out[5]: 's = "atenção"\n'
The problem may be related to the way the file is opened for saving; perhaps it's not being opened as UTF-8 on Windows as it should be (something like open(filename, 'w', encoding='utf8'), but obviously I'm oversimplifying).
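A minimal sketch of the idea above, not the actual cq-editor fix: pass the encoding explicitly when opening the file, so that the bytes written no longer depend on the platform locale. The helper name save_text and the temporary file path are hypothetical and exist only for illustration.

```python
import os
import tempfile


def save_text(path, text):
    """Write text to path as UTF-8, regardless of the platform locale.

    Hypothetical helper: it mirrors the two-line save code quoted in this
    thread, with an explicit encoding argument added.
    """
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


if __name__ == "__main__":
    # Write the sample line from this thread to a scratch file.
    target = os.path.join(tempfile.mkdtemp(), "test3.py")
    save_text(target, 's = "atenção"\n')

    # Read the raw bytes back and confirm they are valid UTF-8.
    with open(target, "rb") as f:
        raw = f.read()
    raw.decode("utf-8")  # raises UnicodeDecodeError if the file is not UTF-8
    print(len(raw), "bytes written")
```

Reading the file back should pass the same encoding, otherwise the mismatch only moves from the save path to the load path.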
@carribeiro what does import locale; locale.getpreferredencoding() produce? This should be the encoding used by open.
I think you nailed it.
import locale; locale.getpreferredencoding()
Out[6]: 'cp1252'
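To check the link between the locale and open directly, one option is to open a scratch file without an encoding argument and inspect the encoding attribute of the resulting file object. This is only a sketch: the helper name default_open_encoding is hypothetical, and the result depends on the Python version and on whether UTF-8 mode is enabled.

```python
import locale
import os
import tempfile


def default_open_encoding():
    """Return the encoding open() picks when no encoding is passed."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        # No encoding argument, same as the save code quoted in this thread.
        with open(path, "w") as f:
            return f.encoding
    finally:
        os.remove(path)


if __name__ == "__main__":
    print("locale preferred encoding:", locale.getpreferredencoding(False))
    print("encoding used by open():", default_open_encoding())
```

On a Windows machine where the locale reports cp1252, both lines would be expected to name cp1252 unless Python is running in UTF-8 mode.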
However, neither IDLE nor VSC uses cp1252; both seem to default to UTF-8 anyway. I guess that's exactly to guarantee interoperability.
Did a quick search and it seems that it's recommended to always save files using UTF-8. Not a definitive source though, just some discussions on developer forums. Maybe someone on the main Python groups could answer that authoritatively.
One discussion that I've found: http://python-notes.curiousefficiency.org/en/latest/python3/text_file_processing.html (it's pretty old but it goes over a lot of the issues on the transition of Unicode encoding in Python and strategies for compatibility)
Well, I'm using the current Python default. Can't you configure Windows to use utf8 in the locale?
I'm still trying to figure out exactly what is happening, but here's a start. I live in Brazil and we use non-ASCII characters in the source files. One such example is the word "Atenção" which means "Attention".
When I create the file above in cq-editor, it is saved as an ISO-8859 text file. If I try to open it in Visual Studio Code, the characters are garbled:
If I create the same file directly on VSC, it shows up fine:
Checking the files on the filesystem, the encoding is different:
The file size is also different:
Now, if I try to open the file that was created in VSC in cq-editor, it opens... but as soon as I save it in cq-editor, the Unicode chars are garbled again.
Now, if I set cq-editor to autoreload, and edit the file on VSC, it shows up fine on cq-editor (including the non-ASCII characters). But as soon as I save it on cq-editor, it is mangled on VSC; and if I save the file with the garbled characters on VSC, then it breaks the characters on cq-editor.
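The symptoms described here (different file sizes, garbling in both directions) are consistent with one program writing the text in a single-byte Windows encoding and the other reading it as UTF-8. The sketch below reproduces that mismatch in memory. It assumes cp1252 as the single-byte encoding, and the helper name misread is hypothetical.

```python
def misread(text, written_as, read_as):
    """Encode text with one encoding and decode the bytes with another.

    Undecodable bytes become U+FFFD so the result can always be inspected.
    """
    return text.encode(written_as).decode(read_as, errors="replace")


SAMPLE = 's = "atenção"\n'

# ç and ã take one byte each in cp1252 but two bytes each in UTF-8,
# which is why the two saved files differ in size.
CP1252_SIZE = len(SAMPLE.encode("cp1252"))
UTF8_SIZE = len(SAMPLE.encode("utf-8"))

if __name__ == "__main__":
    print("cp1252 bytes:", CP1252_SIZE, "utf-8 bytes:", UTF8_SIZE)
    # ascii() keeps the output independent of the console encoding.
    print("utf-8 read as cp1252:", ascii(misread(SAMPLE, "utf-8", "cp1252")))
    print("cp1252 read as utf-8:", ascii(misread(SAMPLE, "cp1252", "utf-8")))
```

Each save and reload with mismatched encodings repeats one of these two transformations, which matches the back-and-forth breakage between the two editors.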