Closed j-chan-hkust closed 3 years ago
This one got me one step further indeed. Great tip!
Now I am running into the following error. Did you encounter the same by chance?
line 186, in addToNotion
title = "Page: " + lastClip['Page'] + "\tDate Added: " + str(lastClip['Date Added'].strftime("%A, %d %B %Y %I:%M:%S %p"))
TypeError: can only concatenate str (not "NoneType") to str
Edit: Never mind - found this pull request - works great 🎉 https://github.com/paperboi/kindle2notion/pull/12
As said before, date extraction is hardcoded and no sanity check is done before the date is parsed.
On a French Kindle the format is something like: Ajouté le dimanche 18 août 2019 22:36:55
In that case you have to replace the strptime format around line 63 with something like:
dateAdded = datetime.strptime(addedOn, ' Ajouté le %A %d %B %Y %H:%M:%S')
But that is not sufficient. French has accented characters, and the unicodedata.normalize call does not handle them correctly because it uses a D (decomposed) form.
I suggest switching to the C (composed) form around line 26:
allClippings = unicodedata.normalize("NFKC", allClippings)
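To illustrate the normalization point above, here is a minimal sketch using only the standard library. It assumes the script currently normalizes with a D form such as "NFKD" (the thread only says "the D form"); the sample string is the one quoted above, written with escapes so the precomposed characters are unambiguous.

```python
import unicodedata

# Sample French Kindle date string ("Ajouté le dimanche 18 août 2019 22:36:55"),
# spelled with escapes so that é and û are guaranteed to be precomposed.
RAW = "Ajout\u00e9 le dimanche 18 ao\u00fbt 2019 22:36:55"
MONTH = "ao\u00fbt"  # "août" as a single-code-point-per-letter string

# Assumption: the D form in use is NFKD; NFKC is the form suggested above.
decomposed = unicodedata.normalize("NFKD", RAW)
composed = unicodedata.normalize("NFKC", RAW)

# In the decomposed form, é becomes e + U+0301 and û becomes u + U+0302,
# so the text no longer contains the precomposed month name.
month_survives_nfkd = MONTH in decomposed
month_survives_nfkc = MONTH in composed

print(month_survives_nfkd, month_survives_nfkc)
print(len(decomposed) - len(composed))  # extra combining marks in the D form
```

Any later comparison against precomposed text (a format string containing "Ajouté", or accented month names) can therefore fail on the D-form text even though it looks identical on screen.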
Sorry for the late reply! Will implement a better date parser in the next commit!
@emmanuelbbc can you provide me more context on the formatting in French Kindle?
Hi @paperboi.
On a French Kindle, lines are formatted like this:
- Votre surlignement à lʼemplacement 102-103 | Ajouté le dimanche 18 août 2019 22:36:55
The date format is %A %d %B %Y %H:%M:%S, parsed under the fr_FR locale, which you can set with locale.setlocale(locale.LC_ALL, "fr_FR")
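The locale approach above depends on fr_FR being installed on the machine running the script. As a hedged alternative, here is a sketch that parses the same line without touching the process locale. The helper name parse_french_added_on and the FRENCH_MONTHS table are hypothetical and are not part of kindle2notion.

```python
import re
import unicodedata
from datetime import datetime

# Hypothetical lookup table of French month names; keys are NFC-normalized
# so they match input that has been normalized the same way.
FRENCH_MONTHS = {
    unicodedata.normalize("NFC", name): number
    for number, name in enumerate(
        ["janvier", "février", "mars", "avril", "mai", "juin", "juillet",
         "août", "septembre", "octobre", "novembre", "décembre"],
        start=1,
    )
}

# Matches "Ajouté le <weekday> <day> <month> <year> <H>:<M>:<S>".
# \S stands in for the accented "é" after normalization.
ADDED_ON = re.compile(
    r"Ajout\S le \w+ (\d{1,2}) (\S+) (\d{4}) (\d{1,2}):(\d{2}):(\d{2})"
)


def parse_french_added_on(line: str) -> datetime:
    """Extract the 'Ajouté le ...' timestamp from a French clipping line."""
    text = unicodedata.normalize("NFC", line)
    match = ADDED_ON.search(text)
    if match is None:
        raise ValueError(f"No French date found in: {line!r}")
    day, month_name, year, hour, minute, second = match.groups()
    month = FRENCH_MONTHS.get(month_name.lower())
    if month is None:
        raise ValueError(f"Unknown month name: {month_name!r}")
    return datetime(int(year), month, int(day),
                    int(hour), int(minute), int(second))
```

The weekday is matched but ignored, since the day, month, and year already determine the date.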
@emmanuelbbc thanks! Will keep it in mind. @philffm thanks for raising the issue! Hope it's working like a charm now!
@emmanuelbbc localization is in the pipeline - will need to do some more reading on it before I tweak the code. Solved the conversion issue in the latest commit. Please do feel free to add anything regarding this particular issue in a new, specific issue thread. Might need some help from you as a tester in the future too! Regards!
@j-chan-hkust also thanks for raising the issue in the first place! Hope the current commit works like a charm for ya!
Date extraction in line 52 assumes the formatting of the date string:
dateAdded = datetime.strptime(addedOn, ' Added on %A, %d %B %Y %X')
which expects something like: Thursday, 18 April 2019 21:12:17
My Kindle actually formats dates as, e.g.: Thursday, April 18, 2019 9:12:17 PM
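The fix is not that hard to implement. Change line 52 to:
dateAdded = datetime.strptime(addedOn, ' Added on %A, %B %d, %Y %I:%M:%S %p')
(%I:%M:%S is used rather than %X because strptime only applies %p when the hour is parsed with the 12-hour %I directive; with %X, 9:12:17 PM would come back as 09:12:17.)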
However, it may be worthwhile to sanity-check the formatting of dates in the clipping object, or to use a more robust date parser after stripping the "Added on " prefix from the string. An example: https://dateparser.readthedocs.io/en/latest/
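As a rough sketch of the sanity-check idea, the snippet below tries each known format in turn and fails loudly if none matches. It uses only the standard library rather than dateparser. The helper name parse_added_on and the CANDIDATE_FORMATS tuple are hypothetical, and the %A and %B directives assume an English (or C) locale.

```python
from datetime import datetime

# Hypothetical list of formats seen in this thread; extend as needed.
CANDIDATE_FORMATS = (
    "%A, %d %B %Y %H:%M:%S",      # Thursday, 18 April 2019 21:12:17
    "%A, %B %d, %Y %I:%M:%S %p",  # Thursday, April 18, 2019 9:12:17 PM
)

PREFIX = "Added on "


def parse_added_on(added_on: str) -> datetime:
    """Parse a Kindle 'Added on ...' string, trying each known format."""
    text = added_on.strip()
    if text.startswith(PREFIX):
        text = text[len(PREFIX):]
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format did not match; try the next one
    # Sanity check: refuse to guess when no known format matches.
    raise ValueError(f"Unrecognised Kindle date format: {added_on!r}")
```

Both sample strings from this issue resolve to the same timestamp, and an unexpected format raises a ValueError instead of failing later when the title is built.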