jeffkowalski / geeknote

Console client for Evernote.
GNU General Public License v3.0

gnsync has problem with mutated vowels #100

Open niklassemmler opened 6 years ago

niklassemmler commented 6 years ago

geeknote version: 2.0.15

Hi there,

I tried uploading a bunch of documents that contained mutated vowels (Umlaute in German), and these did not end up in Evernote.

The problem is that gnsync first converts the file content to ASCII:

    @log
    def _get_file_content(self, path):
        """
        Get file content.
        """
        with codecs.open(path, 'r', encoding='utf-8') as f:
            content = f.read()

        # strip unprintable characters
        content = content.encode('ascii', errors='xmlcharrefreplace') # <--- HERE!
        content = Editor.textToENML(content=content, raise_ex=True, format=self.format)
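
To make the effect of the marked `encode` call concrete, here is a minimal standalone sketch. It is written for Python 3 (geeknote itself targets Python 2), and the sample string is just an illustration. It only shows what `errors='xmlcharrefreplace'` does to umlauts; whether the later steps in gnsync turn the references back into characters is not shown here.

```python
import html

text = "Käse, Öl, Grüße"

# Same call as in _get_file_content: each non-ASCII character becomes a
# numeric XML character reference instead of being kept as a character.
encoded = text.encode('ascii', errors='xmlcharrefreplace')
print(encoded)

# The references still carry the original code points, so an unescaping
# step (here html.unescape from the standard library) can restore the text.
restored = html.unescape(encoded.decode('ascii'))
print(restored == text)
```

So after this step the content is pure ASCII, and every umlaut is present only as a reference such as `&#228;`.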

And then converts it back to Unicode:

    @staticmethod
    def textToENML(content, raise_ex=False, format='markdown', rawmd=False):
        """
        Transform formatted text to ENML
        """

        if not isinstance(content, str): # <--- does not allow unicode
            content = ""                        # <--- same
        try:
            content = unicode(content, "utf-8") # <--- breaks mutated vowels
            # add 2 space before new line in paragraph for creating br tags
            content = re.sub(r'([^\r\n])([\r\n])([^\r\n])', r'\1  \n\3', content)
            # content = re.sub(r'\r\n', '\n', content)
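
For context on the marked type check: under Python 2, `str` is the byte-string type, so a `unicode` object fails `isinstance(content, str)` and is replaced by an empty string. Below is a rough sketch of a check that accepts both text and bytes. It is written for Python 3 (where `str` is text and `bytes` is the byte type), and `normalize_content` is a hypothetical name, not a geeknote function. It is one possible approach, not a tested patch for geeknote.

```python
import re


def normalize_content(content):
    """Hypothetical helper: return text for either bytes or str input."""
    if isinstance(content, bytes):
        # Decode byte input as UTF-8 so umlauts survive as characters.
        content = content.decode('utf-8')
    elif not isinstance(content, str):
        # Anything else is treated as empty, as in the original check.
        content = ""
    # Same regex as in textToENML: add two spaces before a single line
    # break inside a paragraph.
    return re.sub(r'([^\r\n])([\r\n])([^\r\n])', r'\1  \n\3', content)


print(normalize_content("Grüße\nWelt") == normalize_content("Grüße\nWelt".encode('utf-8')))
```

With a check like this, the ASCII encoding step before the call would no longer be needed just to satisfy the type test.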

Commenting out the marked lines solved the issue for me, but I did not dig deeper, so there might be other problems now.

Cheers, Niklas