Can you tell us a bit more about this? What is the specific Chinese system you're using (simplified, traditional)? I don't really know encodings other than 'ASCII' or "utf-8". Is this a specific way of encoding Chinese? We definitely have users that have Chinese in their code and it seems to work for them, so it would be helpful if we could debug this a bit more.
I think this is an https://git.bsse.ethz.ch/hima_public/HDsort problem, isn't it?
Are you using the latest version of spikeinterface? I could not find the lines where your error came from.
I think I'm using the latest version of spikeinterface.
I followed the tutorial on the following page to install spikeinterface:
https://spikeinterface.readthedocs.io/en/latest/get_started/installation.html
pip install spikeinterface[full,widgets]
And when I run print("Installed sorters", ss.installed_sorters())
the problem appears.
And this is my system:
Here is the file in which adding encoding='gbk' solves the problem:
E:\anaconda\envs\spike\Lib\site-packages\spikeinterface\sorters\utils\shellscript.py
I copied all the code in shellscript.py into the following txt: shellscript.txt
And here is the place where I added encoding='gbk':
I think this problem arises because _process.stdout here is sometimes in Chinese on a Chinese system, for example: ('#!' 不是内部或外部命令,也不是可运行的程序或批处理文件。此时不应有 [。), which on an English system would read: ('#!' is not recognized as an internal or external command, operable program or batch file. [ was unexpected at this time.)
As a result, the line cannot be decoded (line 93 of shellscript.py).
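To illustrate the diagnosis above, here is a minimal sketch that reproduces the failure without spikeinterface. It assumes the console emits the message in GBK (code page 936); the helper name try_decode is hypothetical, and the Japanese wording is only an assumed example of cp932 output.

```python
# Sketch: why the shell's error message cannot be decoded as UTF-8.
# Assumption: a Simplified Chinese Windows console writes GBK bytes.

def try_decode(data: bytes, encoding: str):
    """Hypothetical helper: return (text, None) on success, (None, error) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc


message = "'#!' 不是内部或外部命令"
raw = message.encode("gbk")  # the bytes a GBK console would write to the pipe

text_utf8, err = try_decode(raw, "utf-8")
text_gbk, _ = try_decode(raw, "gbk")

print(err)       # the UnicodeDecodeError raised by the utf-8 codec
print(text_gbk)  # the original message, recovered with the right codec

# Assumed Japanese equivalent: the first character after "'#!' " is hiragana,
# whose cp932 (Shift-JIS) lead byte is 0x82.
jp_raw = "'#!' は".encode("cp932")
print(hex(jp_raw[5]))
```

The first five bytes ('#!' plus a space) are plain ASCII, so the decoder fails at index 5, on the first byte of the first non-ASCII character. That is consistent with "position 5" appearing in both reported errors.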
So based on my reading it seems like Windows uses gbk encoding for simplified Chinese, which is causing this problem. I guess the easiest solutions are either you use a private fork with gbk encoding or you use English for the script. Why are you using the shebang in general? Or is that from our code that is trying to use the shebang? Since utf-8 seems to work for most systems I'm not sure what the best solution is here. @alejoe91 do you have any ideas for checking the necessary encoding for shell scripting with different encoding systems? Or do we want to enforce utf-8 and have people make their own forks with gbk?
Could we switch automatically in case a Chinese system is detected?
That's what I was wondering. But I don't know how to do that? I guess we could do a try-except before the shell script and try to decode something and if it fails switch to gbk. But I don't know what we could check? It isn't the OS itself, it's just the typing in terminal, so without forcing a config file I'm not sure.
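One way the try-except idea could look is sketched below. This is not spikeinterface code: decode_shell_output is a hypothetical helper, and using locale.getpreferredencoding(False) as the second candidate is an assumption (on Windows the console code page can differ from the preferred encoding).

```python
import locale


def decode_shell_output(raw: bytes, candidates=None) -> str:
    """Hypothetical helper: try each candidate encoding in order.

    Falls back to UTF-8 with replacement characters so the caller never
    sees a UnicodeDecodeError.
    """
    if candidates is None:
        # Assumption: the locale's preferred encoding matches what the shell emits.
        candidates = ["utf-8", locale.getpreferredencoding(False)]
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            continue  # wrong or unknown codec, try the next one
    return raw.decode("utf-8", errors="replace")


# GBK bytes are not valid UTF-8, so the second candidate is used here.
print(decode_shell_output("不是".encode("gbk"), ["utf-8", "gbk"]))
```

Trying UTF-8 first is only a heuristic: some byte sequences are valid in several encodings, so a successful decode does not prove the guess was right.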
I get a similar error in a system with a Japanese Windows installation. Running:
ss.installed_sorters()
Returns:
---------------------------------------------------------------------------
UnicodeDecodeError Traceback (most recent call last)
Cell In[5], line 7
5 analyzer_obj = si.load_sorting_analyzer(analyzer_fp, format='binary_folder')
6 else:
----> 7 use_docker = False if sorter in ss.installed_sorters() else True
8 sorter_fp = os.path.join(output_path, f'{sorter}_output')
9 if grouped_sorting:
File c:\Users\system-ses\anaconda3\envs\sci_env\Lib\site-packages\spikeinterface\sorters\sorterlist.py:65, in installed_sorters()
62 def installed_sorters():
63 """Lists installed sorters."""
---> 65 return sorted([s.sorter_name for s in sorter_full_list if s.is_installed()])
File c:\Users\system-ses\anaconda3\envs\sci_env\Lib\site-packages\spikeinterface\sorters\sorterlist.py:65, in <listcomp>(.0)
62 def installed_sorters():
63 """Lists installed sorters."""
---> 65 return sorted([s.sorter_name for s in sorter_full_list if s.is_installed()])
File c:\Users\system-ses\anaconda3\envs\sci_env\Lib\site-packages\spikeinterface\sorters\external\hdsort.py:92, in HDSortSorter.is_installed(cls)
90 @classmethod
91 def is_installed(cls):
---> 92 if cls.check_compiled():
93 return True
94 return check_if_installed(cls.hdsort_path)
File c:\Users\system-ses\anaconda3\envs\sci_env\Lib\site-packages\spikeinterface\sorters\basesorter.py:375, in BaseSorter.check_compiled(cls)
367 shell_cmd = f"""
368 #!/bin/bash
369 if ! [ -x "$(command -v {cls.compiled_name})" ]; then
(...)
372 fi
373 """
374 shell_script = ShellScript(shell_cmd)
--> 375 shell_script.start()
376 retcode = shell_script.wait()
377 if retcode != 0:
File c:\Users\system-ses\anaconda3\envs\sci_env\Lib\site-packages\spikeinterface\sorters\utils\shellscript.py:93, in ShellScript.start(self)
89 self._process = subprocess.Popen(
90 cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, bufsize=1, universal_newlines=True
91 )
92 with open(script_log_path, "w+") as script_log_file:
---> 93 for line in self._process.stdout:
94 script_log_file.write(line)
95 if (
96 self._verbose
97 ): # Print onto console depending on the verbose property passed on from the sorter class
File <frozen codecs>:322, in decode(self, input, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x82 in position 5: invalid start byte
This was another concern I had. I bet a lot of character based systems will have their own encoding. I mean at this point we could expose this as a run_sorter kwarg with a default to utf-8 and then users can switch to their own encoding if needed? What do you think @alejoe91? Although that doesn't help with this is_installed... maybe even a global kwarg then? Not sure where it would best go.
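A rough sketch of what a configurable encoding could look like at the subprocess level, assuming nothing about how spikeinterface would actually expose it. The helper run_and_collect and its parameters are hypothetical; encoding and errors are standard subprocess.Popen arguments.

```python
import subprocess
import sys


def run_and_collect(cmd, encoding="utf-8", errors="replace"):
    """Hypothetical helper: run cmd and return (returncode, decoded output lines).

    encoding defaults to utf-8 but can be overridden (for example "gbk" or
    "cp932"); errors="replace" keeps undecodable bytes from raising.
    """
    process = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        encoding=encoding,  # passing encoding puts the pipe in text mode
        errors=errors,
    )
    lines = list(process.stdout)
    retcode = process.wait()
    return retcode, lines


# Simulate a shell that writes GBK bytes by having a child Python process emit them.
payload = "不是".encode("gbk")
child = [sys.executable, "-c", f"import sys; sys.stdout.buffer.write({payload!r})"]

print(run_and_collect(child, encoding="gbk"))  # decoded with the matching codec
print(run_and_collect(child))                  # utf-8 default: replacement characters, no exception
```

With errors="replace", a check like is_installed would not crash on a mismatched encoding, although the logged text would be unreadable. Whether that trade-off is acceptable is a design choice for the maintainers.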
actually, this might just work! #3439
When running the following code on a Chinese system, the following error is reported:
UnicodeDecodeError Traceback (most recent call last)
Cell In[3], line 2
1 print("Available sorters", ss.available_sorters())
----> 2 print("Installed sorters", ss.installed_sorters())
File e:\anaconda\envs\spike\lib\site-packages\spikeinterface\sorters\sorterlist.py:65, in installed_sorters()
62 def installed_sorters():
63 """Lists installed sorters."""
---> 65 return sorted([s.sorter_name for s in sorter_full_list if s.is_installed()])
File e:\anaconda\envs\spike\lib\site-packages\spikeinterface\sorters\sorterlist.py:65, in <listcomp>(.0)
62 def installed_sorters():
63 """Lists installed sorters."""
---> 65 return sorted([s.sorter_name for s in sorter_full_list if s.is_installed()])
File e:\anaconda\envs\spike\lib\site-packages\spikeinterface\sorters\external\hdsort.py:92, in HDSortSorter.is_installed(cls)
90 @classmethod
91 def is_installed(cls):
---> 92 if cls.check_compiled():
93 return True
94 return check_if_installed(cls.hdsort_path)
File e:\anaconda\envs\spike\lib\site-packages\spikeinterface\sorters\basesorter.py:375, in BaseSorter.check_compiled(cls)
367 shell_cmd = f"""
...
--> 322 (result, consumed) = self._buffer_decode(data, self.errors, final)
323 # keep undecoded input until the next call
324 self.buffer = data[consumed:]
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb2 in position 5: invalid start byte
Output is truncated. View as a scrollable element or open in a text editor. Adjust cell output settings...
I solved this problem by adding encoding='gbk' to the subprocess.Popen call in shellscript.py. Is there a better solution?