paperboi / kindle2notion

Export all clippings from your Kindle device to a database in Notion.
https://pypi.org/project/kindle2notion/
MIT License

Date conversion is not handled properly #10

Closed j-chan-hkust closed 3 years ago

j-chan-hkust commented 3 years ago

Date extraction in line 52 assumes a fixed format for the date string:
dateAdded = datetime.strptime(addedOn, ' Added on %A, %d %B %Y %X')
This matches, e.g.: Thursday, 18 April 2019 21:12:17

My Kindle actually formats the date like this: Thursday, April 18, 2019 9:12:17 PM

The fix is not hard to implement. Change line 52 to:
dateAdded = datetime.strptime(addedOn, ' Added on %A, %B %d, %Y %X %p')

However, it may be worthwhile to sanity check the format of the dates in the clipping object, or to strip the "Added on " prefix from the string and hand the rest to a more robust date parser. An example: https://dateparser.readthedocs.io/en/latest/
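A minimal sketch of the "strip the prefix, then try more than one format" idea, using only the standard library. The function name parse_added_on and the CANDIDATE_FORMATS tuple are hypothetical and not part of kindle2notion; the two formats are the ones quoted in this thread. The 12-hour format uses %I:%M:%S %p rather than %X %p because, per the Python documentation, %p only affects the parsed hour when the hour is read with %I. The sketch also assumes English day and month names (the default C locale).

```python
from datetime import datetime

# Hypothetical list of known layouts; extend it as new Kindle formats turn up.
CANDIDATE_FORMATS = (
    "%A, %d %B %Y %H:%M:%S",      # Thursday, 18 April 2019 21:12:17
    "%A, %B %d, %Y %I:%M:%S %p",  # Thursday, April 18, 2019 9:12:17 PM
)


def parse_added_on(added_on: str) -> datetime:
    """Parse the 'Added on ...' part of a clipping, trying each known format."""
    text = added_on.strip()
    prefix = "Added on "
    if text.startswith(prefix):
        text = text[len(prefix):]
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # Not this layout; try the next one.
    # Fail loudly instead of silently producing a wrong or missing date.
    raise ValueError(f"Unrecognised date format: {added_on!r}")
```

With this sketch, both ' Added on Thursday, 18 April 2019 21:12:17' and ' Added on Thursday, April 18, 2019 9:12:17 PM' parse to the same datetime, and an unknown layout raises a ValueError that names the offending string.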

philffm commented 3 years ago

This one got me one step further indeed. Great tip!

Now I am running into the following error. Did you encounter the same by chance?
line 186, in addToNotion
title = "Page: " + lastClip['Page'] + "\tDate Added: " + str(lastClip['Date Added'].strftime("%A, %d %B %Y %I:%M:%S %p"))
TypeError: can only concatenate str (not "NoneType") to str

Edit: Nevermind - found this pull request - works great 🎉 https://github.com/paperboi/kindle2notion/pull/12
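For context, the TypeError above is what Python raises when lastClip['Page'] is None, since "Page: " + None cannot be concatenated. The sketch below shows one way to build the title defensively. It is an illustration only and not necessarily what the linked pull request does; build_clip_title is a hypothetical helper, and the sketch assumes that either field may be missing from a clipping.

```python
from datetime import datetime
from typing import Optional


def build_clip_title(page: Optional[str], date_added: Optional[datetime]) -> str:
    """Build the title string, skipping any field that is missing (None)."""
    parts = []
    if page is not None:
        parts.append("Page: " + page)
    if date_added is not None:
        # Same output format as the line quoted in the traceback.
        parts.append("Date Added: " + date_added.strftime("%A, %d %B %Y %I:%M:%S %p"))
    # Tab-separated, as in the original title string.
    return "\t".join(parts)
```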

emmanuelbbc commented 3 years ago

As said before, date extraction is hardcoded and no sanity check is done before dealing with it.

On a French Kindle, the format is something like: Ajouté le dimanche 18 août 2019 22:36:55

In that case you have to replace the strptime format around line 63 with something like:
dateAdded = datetime.strptime(addedOn, ' Ajouté le %A %d %B %Y %H:%M:%S')

But that is not sufficient. French text contains accented characters.

The unicodedata.normalize call does not handle these correctly, because it uses the D (decomposed) form. I suggest switching to the C (composed) form around line 26:
allClippings = unicodedata.normalize("NFKC", allClippings)
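The two points above can be sketched together. A decomposed form splits "é" into "e" plus a combining accent, so literals such as "Ajouté" or "août" no longer match the input; NFKC recomposes them. The sketch also avoids %A and %B, which only recognise French names when a French LC_TIME locale is active, by using an explicit month table. The names parse_french_added_on, FRENCH_MONTHS and FRENCH_DATE are hypothetical and not part of kindle2notion, and the pattern assumes the line layout quoted in this thread.

```python
import re
import unicodedata
from datetime import datetime


def nfkc(text: str) -> str:
    """Return text in the composed (NFKC) form."""
    return unicodedata.normalize("NFKC", text)


# Hypothetical month table; avoids depending on an installed fr_FR locale.
# Keys are normalised so they compare equal to NFKC-normalised input.
FRENCH_MONTHS = {
    nfkc(name): number
    for number, name in enumerate(
        ["janvier", "février", "mars", "avril", "mai", "juin", "juillet",
         "août", "septembre", "octobre", "novembre", "décembre"],
        start=1,
    )
}

# Matches e.g. "Ajouté le dimanche 18 août 2019 22:36:55".
FRENCH_DATE = re.compile(
    nfkc(r"Ajouté le \w+ (\d{1,2}) (\w+) (\d{4}) (\d{1,2}):(\d{2}):(\d{2})")
)


def parse_french_added_on(line: str) -> datetime:
    """Extract the date from a French clipping line, whatever its Unicode form."""
    match = FRENCH_DATE.search(nfkc(line))
    if match is None:
        raise ValueError(f"No French 'Ajouté le' date found in: {line!r}")
    day, month_name, year, hour, minute, second = match.groups()
    month = FRENCH_MONTHS[month_name.lower()]
    return datetime(int(year), month, int(day), int(hour), int(minute), int(second))
```

Because the input is normalised inside the function, the same call works whether the clippings file was read as-is or had already been passed through a decomposing normalisation.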

paperboi commented 3 years ago

Sorry for the late reply! Will implement a better date parser in the next commit!

@emmanuelbbc can you provide me more context on the formatting in French Kindle?

emmanuelbbc commented 3 years ago

Hi @paperboi.

On a French Kindle, lines are formatted like this:
- Votre surlignement à lʼemplacement 102-103 | Ajouté le dimanche 18 août 2019 22:36:55

You have :

paperboi commented 3 years ago

@emmanuelbbc thanks! Will keep it in mind. @philffm thanks for raising the issue! Hope it's working like a charm now!

paperboi commented 3 years ago

@emmanuelbbc localization is in the pipeline - I will need to do some more reading on it before I tweak the code. The conversion issue is solved in the latest commit. Please feel free to open a new, specific issue thread for anything further on this topic. I might need some help from you as a tester in the future too! Regards!

paperboi commented 3 years ago

@j-chan-hkust also thanks for raising the issue in the first place! Hope the current commit works like a charm for ya!