jhellerstedt closed 6 years ago
OK, this is happening because we're inadvertently pouring Czech-language nonsense into the comment box, and Nanonis then spits out ISO-8859 characters:
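import os

for ii in os.listdir(os.getcwd()):
    if ii.endswith(".dat"):
        # Read the raw bytes so nothing is decoded implicitly
        f = open(ii, 'rb')
        file = f.read()
        f.close()
        try:
            # Already valid UTF-8: the round trip leaves the bytes unchanged
            file = file.decode().encode('utf-8')
        except UnicodeDecodeError:
            # Otherwise treat the bytes as Latin-1 and re-encode as UTF-8
            file = file.decode('latin-1').encode('utf-8')
        # Overwrite the file in place with the UTF-8 bytes
        f = open(ii, 'wb')
        f.write(file)
        f.close()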
Doing this before calling nanonispy.read.Spec fixes my problem. Not sure if there's a smarter way to incorporate this into your read function.
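One way the fallback could live inside a reader instead of rewriting files on disk is sketched below. This is only an illustration, not nanonispy's actual code: `decode_bytes` and `find_tag_offset` are hypothetical names, and the loop is modelled loosely on the `start_byte` lines visible in the traceback in this thread. It assumes that Latin-1 is an acceptable fallback, as in the snippet above.

```python
def decode_bytes(raw, encodings=("utf-8", "latin-1")):
    """Decode raw bytes, trying each encoding in order (hypothetical helper)."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Only reached if a caller passes a list in which every codec failed;
    # Latin-1 itself accepts any byte sequence.
    return raw.decode("utf-8", errors="replace")


def find_tag_offset(path, tag):
    """Return the byte offset just past the first line containing tag.

    Hypothetical stand-in for a header scan; returns None if tag is absent.
    """
    offset = 0
    with open(path, "rb") as f:
        for line in f:
            offset += len(line)
            # Decode per line, so one bad comment cannot break the scan
            if tag in decode_bytes(line.strip()):
                return offset
    return None
```

Because the file is opened in binary mode and only decoded line by line, the original file stays untouched on disk, and the fallback only kicks in for lines that are not valid UTF-8.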
Hi,
Sorry for the lack of a reply, I've been away for quite a while. I'll take a look at fixing it more generally. Do you have a sample file that reproduces it reliably? I can try to recreate it myself using what you described, but if there's something you know triggers it, that would be easier.
Cheers
No worries. Here's a Dropbox link to a spectroscopy file that throws the error: https://www.dropbox.com/s/va14wi26gplsfam/Z-Spectroscopy001.dat?dl=0
You could probably punt on this problem on the grounds that it's a Specs/Nanonis issue with their software not being UTF-8 compatible, but maybe there's a way to integrate the fix I mentioned above into your scheme.
Just letting you know I pushed and merged #7, which I think is a good fix for reading in and handling non-UTF-8 characters. Let me know what you think.
//anaconda/lib/python3.5/site-packages/nanonispy/read.py in start_byte(self)
    112         for line in f:
    113             # Convert from bytes to str
--> 114             print(line)
    115             entry = line.strip().decode()
    116             if tag in entry:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xec in position 10: invalid continuation byte
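The error above can be reproduced in isolation with a few bytes. The sketch below uses a made-up comment string, not the contents of the actual file. In UTF-8, 0xec announces a three-byte sequence, so the decoder fails when the next byte is plain ASCII. Which single-byte code page Nanonis really writes is not established in this thread; ISO-8859-2 is shown only because it maps 0xec to the Czech letter "ě", whereas Latin-1 maps it to "ì".

```python
raw = b"m\xecsto"  # hypothetical comment text containing the byte 0xec

try:
    raw.decode("utf-8")
    reason, position = None, None
except UnicodeDecodeError as err:
    # Same failure mode as in the traceback: 0xec is not followed
    # by a valid UTF-8 continuation byte
    reason, position = err.reason, err.start

# Single-byte codecs accept the byte, but disagree on what it means
as_latin1 = raw.decode("latin-1")     # 0xec -> U+00EC
as_latin2 = raw.decode("iso-8859-2")  # 0xec -> U+011B

print(reason, position)
```

So a Latin-1 fallback always succeeds, but it may not give back the character that was originally typed.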
Hi,
I encounter this semi-regularly, but unfortunately can't reproduce it reliably. Sometimes adding 'ignore' to the decode() call works, sometimes not.
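For reference, here is a small sketch of what the standard `errors` handlers of `bytes.decode` do with an undecodable byte. The sample bytes are made up. Note that a handler only helps if it is passed to every `decode()` call that sees the offending bytes.

```python
raw = b"m\xecsto"  # hypothetical bytes with one non-UTF-8 character

# 'ignore' silently drops the undecodable byte
ignored = raw.decode("utf-8", errors="ignore")

# 'replace' substitutes U+FFFD, so the data loss stays visible
replaced = raw.decode("utf-8", errors="replace")

# 'backslashreplace' keeps the byte value as an escape (Python 3.5+)
escaped = raw.decode("utf-8", errors="backslashreplace")

print(len(ignored), len(replaced), len(escaped))
```

All three avoid the exception, but none of them recovers the original character; a fallback codec is needed for that.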