paperboi / kindle2notion

Export all clippings from your Kindle device to a database in Notion.
https://pypi.org/project/kindle2notion/
MIT License
903 stars 120 forks

Export kindle clipping with Chapter name. #45

Open hitamu opened 3 years ago

hitamu commented 3 years ago

Hi, I'm not sure where to ask this, but if anyone knows a way to export highlights with the chapter name, it would be really helpful for later review. (For now we only have the location in the book.)

hitamu commented 3 years ago

One idea is to highlight the chapter name as well, so we may need to parse it (or mark it as a chapter name somehow, e.g. by adding text to a note).

paperboi commented 3 years ago

Hi @hitamu! Thanks for bringing this up! As far as I know, automatically fetching the chapter titles from a book would be impossible to implement, because chapter information is not stored in My Clippings.txt or in any other metadata file on the Kindle. Bouncing off your idea, the goal would be to tag each heading and so build up a structure of the book as a whole that our program can consult when needed. My proposal for implementing this is as follows:

Basically, you'd have to highlight each chapter and sub-chapter title and attach a note to it in this format (.h1, .h1.1, .h1.1.1 and so on). The code can capture these notes while going through the My Clippings.txt file and store the information. It would be ideal to do this 'tagging' before you start reading a book, so that it doesn't get in the way of your reading experience and ruin immersion. An addToTableOfContents function within parsing.py can be called if the content of a note fits this tag format, and the highlighted text can then be added to a table-of-contents dictionary structure (along with page/location information) to keep track of which chapter the user is in when going through other clippings. A fetchFromTableOfContents function can be written within parsing.py to fetch the chapter that a clipping was taken from while parsing the other clippings.
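To make the idea concrete, here is a minimal sketch of what those two helpers might look like. It is only an illustration under a few assumptions, not code from kindle2notion: the snake_case names `add_to_table_of_contents` and `fetch_from_table_of_contents` stand in for the functions named above, the table of contents is kept as a sorted list of `(location, number, title)` tuples rather than a dictionary, and the caller is assumed to have already paired each tag note with the highlight made at the same location.

```python
import re
from bisect import bisect_right

# Matches tag notes such as ".h1", ".h1.1" or ".h2.3.1" (the format proposed above).
TAG_PATTERN = re.compile(r"^\.h(\d+(?:\.\d+)*)$")


def add_to_table_of_contents(toc, note_text, highlighted_text, location):
    """Record a heading if the note is a chapter tag; return True if it was added.

    `toc` is a list of (location, number, title) tuples kept sorted by location.
    Pairing the note with its highlight is assumed to happen before this call.
    """
    match = TAG_PATTERN.match(note_text.strip())
    if match is None:
        return False  # an ordinary note, not a chapter tag
    toc.append((location, match.group(1), highlighted_text.strip()))
    toc.sort()  # keep entries ordered by location for the lookup below
    return True


def fetch_from_table_of_contents(toc, location):
    """Return the nearest heading at or before `location`, or None if there is none."""
    locations = [entry[0] for entry in toc]
    index = bisect_right(locations, location)
    if index == 0:
        return None  # the clipping comes before the first tagged heading
    _, number, title = toc[index - 1]
    return f"{number} {title}"


# Hypothetical usage with made-up locations and titles.
toc = []
add_to_table_of_contents(toc, ".h1", "Introduction", 10)
add_to_table_of_contents(toc, ".h1.1", "Background", 40)
add_to_table_of_contents(toc, ".h2", "Methods", 200)
chapter = fetch_from_table_of_contents(toc, 45)
```

Looking up the last heading at or before a clipping's location means a clipping is attributed to the deepest heading tagged so far, so sub-chapter tags take precedence over their parent chapter once they appear.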

I'm sure this would be doable in its naive form, but there may be some other necessary checks involved that can be fleshed out in the coding phase. I can add a function to the current codebase to implement this idea, but it'll take me some time to test it out. I'll work on a solution and leave this issue open so other contributors can help with testing and speed up the process. I think it's a pretty neat enhancement and it's not too complex to implement. Thank you for raising this issue and sending me down this particular rabbit hole. I'll post updates here over the next week or two as I figure this out.