Taitava / obsidian-shellcommands

Execute system commands via hotkeys or command palette in Obsidian (https://obsidian.md). Some automated events are also supported, and execution via URI links.
GNU General Public License v3.0

Problems with non-ASCII characters (e.g. Å Ä Ö) (Windows) #5

Closed Taitava closed 3 years ago

Taitava commented 3 years ago

When current note's title is Aikajärjestys 2021, command echo {{title}} >> MyNote.md inserts the following content to MyNote.md:

Aikaj�rjestys 2021

So ä is corrupted.

OS: Windows

Taitava commented 3 years ago

Also echo Ö >> MyNote.md results in � so the issue is not only related to variables.
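
One plausible reading of the � symptom, sketched in Node.js: the shell writes a single-byte encoding and the reader decodes those bytes as UTF-8. The use of latin1 below is an assumption for illustration only. The real console code page on a Finnish Windows is probably a different one, but any single-byte encoding produces the same effect.

```javascript
// Assumption: the shell writes "ä" as one byte (0xE4 in latin1).
// Node.js's built-in "latin1" encoding stands in for the real console code page.
const title = "Aikaj\u00E4rjestys 2021"; // "Aikajärjestys 2021"
const bytesFromShell = Buffer.from(title, "latin1");

// Decoding those bytes as UTF-8 fails on 0xE4, because it is not a complete
// UTF-8 sequence, so the decoder substitutes U+FFFD (displayed as �).
const misread = bytesFromShell.toString("utf8");

// Decoding with the encoding that was actually used recovers the text.
const recovered = bytesFromShell.toString("latin1");

console.log(misread);
console.log(recovered);
```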

Taitava commented 3 years ago

My Windows 10 uses the Finnish language, so I get error messages in Finnish too (at least sometimes, depending on the error). For example, the command cd FolderThatDoesNotExist fails like so:

[image]

It should say Määritettyä polkua ei löydy (means something like: The path was not found), so ä and ö are also corrupted in error messages.

Taitava commented 3 years ago

This Stack Overflow question has a few answers to the same problem, but I'm not certain if their solutions are solid. I'll go through them here:

Taitava commented 3 years ago

More links:

I didn't try this stuff yet.

Taitava commented 3 years ago

I tested this on Linux, and Å Ä Ö characters work ok. So it's only Windows specific, like I expected.

FelipeRearden commented 3 years ago

Hello @Taitava !!!!

I bring bad news :(

I'm writing about my scenario that's (maybe ?) related to this Issue here, but I can open a new one if you think that's better. I think we have the same issue on MacOS for characters/letters used in the Portuguese language:

áàâã éèê íï óôõ ú ç ÁÀÂÃ ÉÈ ÍÏ ÓÔÕ Ú Ç

I found this when I was starting the tests of my tags workflow with one of my real notes (those notes have tags written in Portuguese and English) 😥

Would you mind trying in Linux a shell command with {{tags}} in a note with these tags to see if it happens with you too?

Tags:

Maçã

Açúcar

CAFÉ

Wrong Result on MacOS:

Maçã

Açúcar

CAFÉ

I created tags with all the letters and found these relations between the letters and the output from Shell Commands (I don't know if it helps): Dictionary: Screen Shot 2021-10-06 at 07 04 46
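
The pattern in the wrong result (two characters for every accented letter) is what UTF-8 bytes look like when each byte is read as a separate character in a single-byte encoding. A minimal Node.js sketch, assuming that is what happens here. Which single-byte encoding is involved on the Mac side is not established in this thread; latin1 is used only because Node.js supports it natively.

```javascript
// Assumption: the text leaves the plugin as UTF-8 and is read back byte by byte.
const tag = "Ma\u00E7\u00E3"; // "Maçã"
const utf8Bytes = Buffer.from(tag, "utf8"); // ç -> C3 A7, ã -> C3 A3

// Each byte becomes its own character: C3 -> Ã, A7 -> §, A3 -> £.
const mojibake = utf8Bytes.toString("latin1");
console.log(mojibake); // MaÃ§Ã£

// The damage is reversible as long as no byte was lost along the way.
const repaired = Buffer.from(mojibake, "latin1").toString("utf8");
console.log(repaired === tag); // true
```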

Additional Information:

Testing your Example

I did the same test from your example with file_path and title

Aikajärjestys 2021 and the result was wrong too :(


I don't know if there is something you can do, but I think I should let you know about this :)

Have a great day!!!!

Taitava commented 3 years ago

Thanks @FelipeRearden for your report! This task is absolutely the correct place for it. The task's title is just a bit "narrow": the issue is not just about letters Å, Ä and Ö, it's more generally about non-ASCII letters. I'll edit the title.

I will test on Linux in a moment.

Taitava commented 3 years ago

On Linux, the non-ASCII characters work ok: image (I made a temporary edit to one of my files in a test vault, so the image contains also other stuff not related to this issue).

I used the following shell command: echo "Tags: {{tags:, }}" >> TestResults.md. So the file on the left is a source file (active during execution, so {{tags}} refer to that file), and the file on the right is a destination for the shell command's output.

FelipeRearden commented 3 years ago

@Taitava !!!!!

Wait just a second I found something interesting .....

I get the same result as you :)

Screen Shot 2021-10-06 at 09 40 07

Taitava commented 3 years ago

Should I do some other kind of test on Linux?

FelipeRearden commented 3 years ago

BUT ... the problem is when we copy {{tags}} {{file_path}} to the clipboard to paste after

Look ....

Screen Shot 2021-10-06 at 09 43 16

Maybe this is the Issue we are facing on any OS!!!!

Taitava commented 3 years ago

> BUT ... the problem is when we copy {{tags}} {{file_path}} to the clipboard to paste after

Can you give the actual shell command that you are using, please? 🙂

Taitava commented 3 years ago

About the Windows problem, I might have a tentative idea on how to "fix" this on Windows. But I do not have an idea on how to fix this on Mac.

The Windows fix idea is roughly something like this:

*) This code page name can be fetched via the shell command chcp. The plugin could offer a button for a user to press if they want the plugin to run that command, in which case the code page setting would be filled in automatically. This would only appear to Windows users.
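
A rough sketch of how the code page lookup described in the footnote could work, assuming Node.js's child_process module. The function names parseCodePage and detectCodePage are hypothetical and not part of the plugin. The sketch also assumes that chcp prints the code page number at the end of its output, whatever the display language is.

```javascript
const { execSync } = require("child_process");

// Hypothetical helper: pull the code page number out of chcp's output,
// e.g. "Active code page: 850". Only the digits are used, because the
// label in front of them is localized.
function parseCodePage(chcpOutput) {
  const match = /(\d+)\D*$/.exec(chcpOutput);
  return match ? Number(match[1]) : null;
}

// Hypothetical helper: run chcp, but only on Windows, where the command exists.
function detectCodePage() {
  if (process.platform !== "win32") {
    return null;
  }
  // Read the output as latin1 so that a localized label cannot break decoding;
  // the digits are plain ASCII in any case.
  const output = execSync("chcp", { encoding: "latin1" });
  return parseCodePage(output);
}

console.log(parseCodePage("Active code page: 850\r\n")); // 850
console.log(detectCodePage()); // null on Mac and Linux
```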

FelipeRearden commented 3 years ago

> > BUT ... the problem is when we copy {{tags}} {{file_path}} to the clipboard to paste after
>
> Can you give the actual shell command that you are using, please? 🙂

Sure!!!!

echo "tags:"["{{tags:,}}"] | pbcopy

then I just cmd+v to paste

FelipeRearden commented 3 years ago

I think I found something that could be a source of information for MacOS:

pbcopy and non-ASCII http://hints.macworld.com/article.php?story=20081231012753422

The link is broken here: Screen Shot 2021-10-06 at 10 05 11

But I found the right URL to this blog post

https://sigpipe.macromates.com/2005/clipboard-access-from-shell-utf-8/

Maybe this could help us 🙏

FelipeRearden commented 3 years ago

printf "tags:"["{{tags:,}}"] + our new output Current File:caret position is working too !!!!!!!!!!

I'm starting to think that a new output, copy to clipboard, might be a way ❓❓❓❓❓❓❓❓❓❓

Taitava commented 3 years ago

Can you do a couple more tests?

> I'm starting to think that a new output, copy to clipboard, might be a way ❓❓❓❓❓❓❓❓❓❓

Possibly yes. It just doesn't fix the root problem. It only offers a workaround. But sure, it's something I should include in the output options, after all.

FelipeRearden commented 3 years ago

> In Shell commands, execute this command: echo "tags:"["{{tags:,}}"] >> SomeNote.md.

perfect !!!!

> Open your Mac's normal terminal

Screen Shot 2021-10-06 at 10 46 53

Result Screen Shot 2021-10-06 at 10 47 26

Press cmd+v and nothing is on my clipboard

I'm not comfortable using Terminal, scared of breaking something

Taitava commented 3 years ago

> Result Screen Shot 2021-10-06 at 10 47 26
>
> Press cmd+v and nothing is on my clipboard

I didn't quite understand this result, so let's try to simplify the test command a bit:

Open your Mac's normal terminal and execute a shell command: echo "tags: Maçã, Açúcar, CAFÉ" | pbcopy.

> I'm not comfortable using Terminal, scared of breaking something

Well, you use the same commands in terminal as in Shell plugins, so you should be scared when you use Shell commands, too. 😉 .

FelipeRearden commented 3 years ago

I'm gonna write here the terminal steps/results that I did:

[line 1] login@computer ~ % echo "tags: Maçã, Açúcar, CAFÉ"| pbcopy
[line 2] login@computer ~ %

Then I hit cmd+v in Obsidian and .... tags: Maçã, Açúcar, CAFÉ !!!!!!!!!!

[Before doing this in Terminal I checked, and my clipboard was not tags: Maçã, Açúcar, CAFÉ :) ]

FelipeRearden commented 3 years ago

> Well, you use the same commands in terminal as in Shell plugins, so you should be scared when you use Shell commands, too. 😉 .

I know you are right. But for newbies Terminal is a scary place. But I got some courage and did it this time :)

I felt more comfortable with Shell Commands because I can ask for your help before doing something scary :)

Taitava commented 3 years ago

> > In Shell commands, execute this command: echo "tags:"["{{tags:,}}"] >> SomeNote.md.
>
> perfect !!!!

So it seems that using non-ASCII characters in Shell commands generally works on Mac. At least most of the time...

> I'm gonna write here the terminal steps/results that I did:
>
> [line 1] login@computer ~ % echo "tags: Maçã, Açúcar, CAFÉ"| pbcopy
> [line 2] login@computer ~ %
>
> Then I hit cmd+v in Obsidian and .... tags: Maçã, Açúcar, CAFÉ !!!!!!!!!!
>
> [Before doing this in Terminal I checked, and my clipboard was not tags: Maçã, Açúcar, CAFÉ :) ]

So, pbcopy with non-ASCII characters does work when used in a normal terminal. pbcopy with non-ASCII characters does not work if used via Shell commands.

Can you still try this in Shell commands: echo "tags: {{tags:,}}" | echo "another text" >> SomeNote.md. Does it put both of the texts to SomeNote.md correctly? What I'm interested in testing here is the pipe character |: can it have some problem in Shell commands?

FelipeRearden commented 3 years ago

> So it seems that using non-ASCII characters in Shell commands generally works on Mac.

Yes! {{variables}} are working great. {{clipboard}} too. Screen Shot 2021-10-07 at 06 49 55 Screen Shot 2021-10-07 at 06 49 19

> So, pbcopy with non-ASCII characters does work when used in a normal terminal. pbcopy with non-ASCII characters does not work if used via Shell commands.

Yeah, exactly !!!!

> Can you still try this in Shell commands: echo "tags: {{tags:,}}" | echo "another text" >> SomeNote.md. Does it put both of the texts to SomeNote.md correctly? What I'm interested in testing here is the pipe character |: can it have some problem in Shell commands?

Sure! take a look.... just another text was transferred !!!! no pipe

Screen Shot 2021-10-07 at 05 43 56

Let's hope that the pipe is the problem 🙏

Have a great day!

FelipeRearden commented 3 years ago

@Taitava, I just wanna say I'm sorry for bringing this problem yesterday.

I'm sorry if this time I brought problems and not ideas for you :)

Taitava commented 3 years ago

> > So it seems that using non-ASCII characters in Shell commands generally works on Mac.
>
> Yes! {{variables}} are working great. {{clipboard}} too. Screen Shot 2021-10-07 at 06 49 55 Screen Shot 2021-10-07 at 06 49 19

Btw that preview text is not from shell. I mean, the variable previews (both in settings and in command palette) are done completely inside the application (Obsidian + this plugin), so variable values showing up with correct letters does not mean that the letters would work when the shell command gets executed, because these previews do not use shell execution in any way.

> take a look.... just another text was transferred !!!! no pipe
>
> Screen Shot 2021-10-07 at 05 43 56
>
> Let's hope that the pipe is the problem 🙏

Damn. The command I gave you relied on the idea that the second echo would pass its input as is to the output and then add new text (= "another text"), but instead echo discards its input. My fault! I just can't come up with a better test command now.
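
For reference, a small Node.js sketch of why that test could not say anything about the pipe, and of one command that does pass its input through. It assumes a POSIX sh at /bin/sh (true on Mac and Linux). Grouping cat and echo is just one possible replacement command, not something that was tried in this thread, and the sample text is made up.

```javascript
const { execSync } = require("child_process");

// Run a command line through sh and return what it prints to stdout.
function run(commandLine) {
  return execSync(commandLine, { shell: "/bin/sh", encoding: "utf8" });
}

// echo never reads stdin, so the text piped into it is silently dropped.
const discarded = run('echo "tags: first" | echo "another text"');

// cat copies stdin to stdout first, then echo appends the second line.
const passedThrough = run('echo "tags: first" | { cat; echo "another text"; }');

console.log(JSON.stringify(discarded));     // "another text\n"
console.log(JSON.stringify(passedThrough)); // "tags: first\nanother text\n"
```

If the first line survives in the second variant, the pipe itself works and the problem has to be looked for elsewhere.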

> @Taitava, I just wanna say I'm sorry for bringing this problem yesterday.
>
> I'm sorry if this time I brought problems and not ideas for you :)

Don't be sorry! We are dealing with software here, and software is full of issues, always. That's a law of nature. If you use a computer, you have problems, it can't be avoided!

What we have achieved now on the Mac side is that we have determined that the non-ASCII problem occurs, for some reason, when pbcopy is used. I'm not saying pbcopy has a bug - the bug may be in SC as well - but currently pbcopy seems to be involved somehow. In theory, it's possible that the Mac issue can happen with other commands too in addition to pbcopy. But for now the issue seems quite rare, so I'd say let's give it a break. We can see if other people come up with similar issues later, or if you happen to come up with new experiences regarding this in the future, then we can continue.

FelipeRearden commented 3 years ago

> Don't be sorry! We are dealing with software here, and software is full of issues, always. That's a law of nature. If you use a computer, you have problems, it can't be avoided!

Thanks @Taitava :)

> What we have achieved now on the Mac side is that we have determined that the non-ASCII problem occurs, for some reason, when pbcopy is used.

It's too bad that we don't have a | pbcopy equivalent on Win and Linux. That way you could know whether this is related to SC transferring content to the clipboard or not.

> But for now the issue seems quite rare, so I'd say let's give it a break. We can see if other people come up with similar issues later, or if you happen to come up with new experiences regarding this in the future, then we can continue.

100% agree. The important thing is that we found out something to be aware of when creating Shell Commands for complex workflows.

I wish you a fantastic day!

Taitava commented 3 years ago

> It's too bad that we don't have a | pbcopy equivalent on Win and Linux. That way you could know whether this is related to SC transferring content to the clipboard or not.

Even if we did, it would be a different program in its source code and logic, so it might not have the same problems (or features) as Mac's pbcopy program.

> I wish you a fantastic day!

Same to you! 🙂

Taitava commented 3 years ago

I removed Mac from the title and labels, to focus again more on Windows.

FelipeRearden commented 3 years ago

OK @Taitava !!! Good luck, I hope you find a solution for Windows 🙏

Taitava commented 3 years ago

Regarding the Windows problem, I've concluded the following:

Below I will go through my old post that listed some "possible solutions".

  • Answer by BladeMight (accepted by the asker): Suggests preceding the shell command with cmd /c chcp 65001>nul && when the OS is Windows. Seems to work for people, but it's also mentioned that the charset (chcp 65001) may be different in each language version of Windows. Also, adding this kind of preliminary command feels a bit quirky to me, and I don't feel comfortable with adding extra commands that a user cannot see. A user thinks they are executing command X, but what actually gets executed is Y and X.

Even though I first hesitated to use this, I finally tried it. It does not work. CMD's output still uses some other encoding than UTF-8.

  • Answer by Zhang Buzz: Suggests converting the command's encoding to cp936, which works for Simplified Chinese, but Zhang is actually the one who warns that the encoding varies between languages, and that the correct encoding should be checked from Microsoft's website (no link provided). I like this answer more than the one above. It does not inject additional shell commands to be executed. But it still has the problem that I cannot be certain which encoding should be used for the conversion.

I tried hard to convert the command's encoding. I was not even able to get iconv-lite imported into SC, so my conversion attempts failed badly. But then again, it does not matter, because as I now know, the problem is not in the encoding of the shell command, it's about getting CMD to use UTF-8 when it outputs stuff - be it output to a file or output back to Obsidian/SC. So, ditch all encoding conversion solutions on the application's side - they are hard to implement, and even if the implementation succeeded, they could not affect CMD's output to files. CMD's output to Obsidian/SC could be converted to UTF-8 (in theory), but that would not affect output written to a file directly from CMD.

  • Answer by Paul Verest: Suggests using the second argument of Node.js's child_process.exec() method to provide an options object, with one option being encoding. This still needs the encoding to be figured out. I played around with this setting a couple of weeks ago (I found it myself, too), but I was not able to fix anything, as I didn't realise that I should not use any utf* encodings here, but instead some Windows encodings. So far it's the simplest solution, but it does not solve the problem completely.

This encoding only controls how Node.js converts the output that comes back to Obsidian/SC. It does not affect CMD in any way. By default, Node.js converts the output to UTF-8. Or actually: it claims to do so. But still the output can be garbage like � instead of Ö. This option could be used to convert the output to e.g. UTF-16 LE, binary, base64, or buffer, among some other things. I do not know where I would use these. But clearly, this option is not a solution to this problem.
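
To make the point about the encoding option concrete, here is a minimal Node.js sketch. It asks child_process for raw bytes (encoding: "buffer") and decodes them by hand. The child process is a stand-in that emits the single byte 0xD6, which is Ö in latin1. A real CMD would emit bytes in its own code page, which is not reproduced here.

```javascript
const { execFileSync } = require("child_process");

// Stand-in for a shell whose output is not UTF-8: a child Node.js process
// that writes the single byte 0xD6 ("Ö" in latin1) to stdout.
const rawOutput = execFileSync(
  process.execPath,
  ["-e", "process.stdout.write(Buffer.from([0xD6]))"],
  { encoding: "buffer" } // return the bytes untouched instead of a string
);

// Decoding as UTF-8 yields U+FFFD, the � replacement character.
console.log(rawOutput.toString("utf8"));

// Decoding with the encoding the child actually used gives the right letter.
console.log(rawOutput.toString("latin1")); // Ö
```

The encoding option only changes how Node.js decodes what comes back. It has no influence on what the shell writes, either to its output stream or to a file.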

My conclusion is that on Windows, SC should start to lean towards PowerShell. PowerShell also has these encoding problems, but as it's newer technology (not new, but newer than CMD), I guess it won't be as big a can of worms as CMD is. There should be a way to make PowerShell live and breathe UTF-8 instead of anything else.

I'm not banishing CMD completely from SC. I'm not yet familiar with PowerShell, but AFAIK, PowerShell is not totally backwards compatible with CMD, so users might have CMD commands that they are not able to execute on PowerShell without changes. So there will be an option to choose between CMD and PowerShell on Windows. Actually, there will be an ability to pick a shell freely. Linux and Mac have also other shell options than Bash. But I will create a new discussion regarding this shell selection feature. There's so much to think that I can't write it now, I need to get it clear in my head first.

So CMD will be around also in the future, but with two practical notes:

I have decided not to inspect this issue anymore. I trust that the ability to use PowerShell in the future will bury this source of misery for good. Therefore, I'm confident to close this issue now and mark it as wontfix.

I feel terrible about writing this long pile of text, because people might end up reading it.

FelipeRearden commented 3 years ago

@Taitava !!!!!

Great news: for MacOS, the new clipboard output was able to "fix" the Issue that we had with | pbcopy on MacOS. #68

I'm writing here in case we forget in the future :)

Taitava commented 3 years ago

@FelipeRearden nice to hear! 🙂

Taitava commented 3 years ago

I'm now testing the character encoding things with Windows PowerShell 5 and PowerShell Core. PowerShell Core is newer than PowerShell 5.

Core seems to be working fine with å ä ö letters. But PowerShell 5 uses UTF-16 encoding when outputting to a file with >/>>, which makes all output characters messy. But when the output goes to any of SC's own output channels, PowerShell 5 uses UTF-8 and all characters are fine.
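
As a userland illustration, a file written by PowerShell 5 with > can still be read correctly if it is decoded as UTF-16 LE. A minimal Node.js sketch follows. The function name decodeRedirectedOutput is hypothetical, and the assumption that such a file starts with a byte order mark (FF FE) was not tested in this thread.

```javascript
// Hypothetical helper: decode file content that may be UTF-16 LE (with a BOM)
// or UTF-8, and return a normal JavaScript string.
function decodeRedirectedOutput(bytes) {
  const hasUtf16LeBom = bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe;
  if (hasUtf16LeBom) {
    // Skip the two BOM bytes and decode the rest as UTF-16 LE.
    return bytes.subarray(2).toString("utf16le");
  }
  return bytes.toString("utf8");
}

// Simulated file content: BOM followed by "åäö" in UTF-16 LE.
const simulated = Buffer.concat([
  Buffer.from([0xff, 0xfe]),
  Buffer.from("\u00E5\u00E4\u00F6", "utf16le"),
]);

console.log(decodeRedirectedOutput(simulated)); // åäö
```

With a real file, the bytes would come from fs.readFileSync(path) called without an encoding argument, which returns a Buffer.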

I just leave this comment as a record here. I'm not certain if I can provide any solution for this charset inconsistency in PowerShell 5. I mean, there are some, but I feel those do not belong under the umbrella of this plugin, I feel they belong to the userland, at least at the moment.

twibiral commented 1 year ago

Hi everyone, I'm coming from the Obsidian Execute Code Plugin where we had a nearly identical problem (See twibiral/obsidian-execute-code#200).

If your problem still persists for PowerShell on Windows (and you even still care about it at this point) I can maybe help you: The problem was that PowerShell still uses windows-1252 as the default encoding on Windows for legacy reasons. Node.js and JavaScript use UTF-8 and UTF-16 (internally, all JS strings are UTF-16). Unfortunately, JS does not support windows-1252 out of the box. But in most cases it is enough to just use the latin1 encoding. So every time you call a child process or especially write a file, try to encode everything with latin1 and it will probably work.

Cheers!

Taitava commented 1 year ago

Thank you @twibiral ! Yes, the problem still exists and yes, I'm interested in exploring solutions for it 🙂 . I'll try latin1 and let you know here if I succeed. 👍

Taitava commented 1 year ago

@twibiral now that I took a look at this and tested without modifications, seems that PowerShell Core works ok (pwsh.exe), but PowerShell 5 has issues (powershell.exe). I re-read this issue now and remember this talked about CMD.EXE, not so much about PowerShell.

I tried the following conversion, but it didn't change anything for PowerShell 5, nor for CMD.EXE:

const convertedShellCommand = transcode(Buffer.from(originalShellCommand), "utf8", "latin1").toString("latin1");

I don't know if my conversion logic fails (I haven't had time yet to test whether it actually changes the string) or if it's actually a correct solution, but to the wrong problem.

I know the JS line I posted is not a full example. I was using it in this plugin's code, where execution logic is split over multiple files, so I can't post a meaningful full example just now. I'll need to write a simple code sample that can better isolate the problem from the rest of the code.
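
One possible explanation for why the transcode() line posted above changed nothing, sketched as a standalone Node.js example: transcode() does produce latin1 bytes, but .toString("latin1") decodes them straight back into the same JavaScript string. The sample command is made up for illustration. The sketch only shows the round trip, not what the plugin's execution code does with the string afterwards.

```javascript
const { transcode } = require("buffer");

// Made-up sample command containing non-ASCII letters.
const originalShellCommand = "echo \u00D6 \u00E4"; // "echo Ö ä"

// Step 1: UTF-8 bytes -> latin1 bytes. The byte content really changes here.
const utf8Bytes = Buffer.from(originalShellCommand, "utf8");
const latin1Bytes = transcode(utf8Bytes, "utf8", "latin1");
console.log(utf8Bytes.length, latin1Bytes.length); // 10 8

// Step 2: decoding the latin1 bytes as latin1 yields the original string again.
const convertedShellCommand = latin1Bytes.toString("latin1");
console.log(convertedShellCommand === originalShellCommand); // true
```

If the round trip is indeed the cause, the bytes in latin1Bytes would have to reach the shell as bytes, without being turned back into a string first. Whether that fixes the output is a separate question that this sketch does not answer.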

@twibiral did you have the problem with PowerShell 5 or PowerShell Core, or both?