fepegar closed this issue 4 years ago
Yes. I seem to remember removing all these accents from the beta version of the database, but as I was not the only one entering data, the accents reappeared in the data and were then added to the SemioDict YAML file, which causes this error.
Q) What's the best way to resolve this?
I think removing the accents everywhere is easiest, especially because people are unlikely to type them when they search for custom terms.
@neurokleos @thenineteen
The characters with accents in déjà vu and déjà vécu are causing this error in the Slicer module. Can we remove the accents?
I assumed this happens with Psychic semiology, but I can't get the error or the logs in 3D Slicer to give me this message at the moment. Please tell me how to reproduce this error so I can fix it and confirm that it is fixed. @fepegar
I get this as soon as I open the module in 3D Slicer. Maybe it's because I'm using the latest Slicer version.
Python 3.6.7 (default, Aug 18 2020, 23:07:01)
[GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.11.45.5)] on darwin
>>>
Loading Slicer RC file [/Users/fernando/.slicerrc.py]
Slicer RC file loaded [27/08/2020 09:07:36]
Traceback (most recent call last):
  File "/Users/fernando/git/Semiology-Visualisation-Tool/slicer/SemiologyVisualisation.py", line 89, in setup
    self.logic.installRepository()
  File "/Users/fernando/git/Semiology-Visualisation-Tool/slicer/SemiologyVisualisation.py", line 961, in installRepository
    import mega_analysis
  File "/Users/fernando/git/Semiology-Visualisation-Tool/mega_analysis/__init__.py", line 3, in <module>
    from .crosstab.mega_analysis.custom_semiology_SemioDict_lookup import (
  File "/Users/fernando/git/Semiology-Visualisation-Tool/mega_analysis/crosstab/mega_analysis/custom_semiology_SemioDict_lookup.py", line 11, in <module>
    SemioDict = yaml.load(f, Loader=yaml.FullLoader)
  File "/Applications/Slicer.app/Contents/lib/Python/lib/python3.6/site-packages/yaml/__init__.py", line 112, in load
    loader = Loader(stream)
  File "/Applications/Slicer.app/Contents/lib/Python/lib/python3.6/site-packages/yaml/loader.py", line 24, in __init__
    Reader.__init__(self, stream)
  File "/Applications/Slicer.app/Contents/lib/Python/lib/python3.6/site-packages/yaml/reader.py", line 85, in __init__
    self.determine_encoding()
  File "/Applications/Slicer.app/Contents/lib/Python/lib/python3.6/site-packages/yaml/reader.py", line 124, in determine_encoding
    self.update_raw()
  File "/Applications/Slicer.app/Contents/lib/Python/lib/python3.6/site-packages/yaml/reader.py", line 178, in update_raw
    data = self.stream.read(size)
  File "/Applications/Slicer.app/Contents/bin/../lib/Python/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2803: ordinal not in range(128)
Maybe utf-8 encoding can be used when opening the YAML file that is read in SemioDict = yaml.load(f, Loader=yaml.FullLoader). But, as I said, it might be more user-friendly to remove the accents, as most users won't even know how to type accents on their keyboards anyway.
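The traceback suggests the file is decoded with the ASCII codec before the YAML parser sees the text, so one possible fix is to pass an explicit encoding to open(). A minimal sketch using only the standard library; the file name, its contents, and the helper read_text_utf8 are hypothetical stand-ins, not the repository's actual code:

```python
import tempfile
from pathlib import Path


def read_text_utf8(path):
    """Read a text file, decoding explicitly as UTF-8 instead of the locale default."""
    with open(path, encoding="utf-8") as f:
        return f.read()


# Hypothetical stand-in for the SemioDict YAML file, written as UTF-8 bytes.
demo_path = Path(tempfile.mkdtemp()) / "semiodict_demo.yaml"
demo_path.write_bytes("psychic:\n  - déjà vu\n  - déjà vécu\n".encode("utf-8"))

# Decoding as ASCII fails on the first accented character (its first
# UTF-8 byte is 0xc3), which is the same kind of failure as in the traceback.
try:
    with open(demo_path, encoding="ascii") as f:
        f.read()
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Decoding as UTF-8 succeeds and keeps the accents intact.
text = read_text_utf8(demo_path)
```

A file handle opened with encoding="utf-8" in this way could then be passed to yaml.load(f, Loader=yaml.FullLoader), so the accented terms would be read without changing the data.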
The reason the accents are there is to match the use of these same terms in the publications in the database. I'm not expecting users to type the accents.
If there is a way to accurately encode these characters and match the same characters in the database without altering both the data and the dictionary, then this would be ideal.
My knowledge of UTF-8 and other encodings isn't enough to figure this out.
Are you saying that if we changed it to use the full loader with utf-8, this should work?
I think so. But if the user searches for deja vu, it won't match déjà vu. Although I guess you could modify the corresponding function so that it does.
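One way such a function could match both spellings is to compare accent-stripped, case-folded forms of the query and the stored term. A rough sketch, assuming Unicode decomposition is acceptable for these terms; strip_accents and terms_match are hypothetical names, not functions from the repository:

```python
import unicodedata


def strip_accents(text):
    """Return text with combining accent marks removed (é -> e, à -> a)."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def terms_match(query, term):
    """Compare two terms while ignoring accents and letter case."""
    return strip_accents(query).casefold() == strip_accents(term).casefold()
```

With this approach the accented spelling can stay in the data and the dictionary, and only the comparison ignores accents, so terms_match("deja vu", "déjà vu") is true.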
The idea is that both the accented and unaccented versions of deja vu will be included in the SemioDict YAML for the predefined semiology list, as they are now, but without causing the errors you've reported. I'll try utf-8.
@fepegar I still can't reproduce this error in Slicer or in VS Code.
Please can you send a screenshot of the error and tell me which semiology you used (presumably Psychic?)
It would also be helpful to know your versions of the yaml package and pandas.
Replaced all é with e: 65 replacements in the Semio2Brain Database.
Replaced all à with a: 41 replacements in the Semio2Brain Database.
I also did the same for the two entries in SemioDict.
See commit below.