This is probably a pretty big ticket. We essentially need to find all of the ebooks in a folder specified by a configuration value. For each item we find, we need to index it in SQLite.
Here's an example structure so you can see what we're looking for and what kind of data we're looking to pull out.
M:
| > comics
| | > Marvel (this is a publisher)
| | | > Iron Man (this is a comic series)
| | | | > 44 (this is the series number)
| | | | | > iron_man_44.pdf (whatever the name of the file is)
| | | | | > iron_man_44_cover.jpg (the cover image for this comic)
| > books
| | > Henry James (this is an author)
| | | > The Portrait of a Lady (this is a book)
| | | | > portrait.epub (whatever the name of the file is)
| | | | > portrait_cover.jpg (the cover image for this book)
If we were to walk the file tree presented above, we would do the following (keep in mind we're only supporting books at the moment):
Create a new author named "Henry James" (unless that author already exists).
Create a new book titled "The Portrait of a Lady" (again, unless it already exists from a previous crawl).
That book should have its filepath value set to the path at which we last indexed the file.
You can ignore the cover for now. We'll hit that up in another iteration.
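To make the steps above concrete, here's a rough sketch of what the crawl could look like. Nothing here is decided: the table and column names (other than filepath), the `index_books` function, and the list of file extensions we treat as ebooks are all placeholders, and it assumes the books/<author>/<title>/<file> layout from the example.

```python
import sqlite3
from pathlib import Path

# Assumption: which extensions count as ebooks is not specified in this ticket.
EBOOK_SUFFIXES = {".epub", ".pdf", ".mobi"}

# Hypothetical schema; only the filepath column is named in the ticket.
SCHEMA = """
CREATE TABLE IF NOT EXISTS authors (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS books (
    id INTEGER PRIMARY KEY,
    author_id INTEGER NOT NULL REFERENCES authors(id),
    title TEXT NOT NULL,
    filepath TEXT NOT NULL,
    UNIQUE (author_id, title)
);
"""


def index_books(root: Path, conn: sqlite3.Connection) -> int:
    """Index every ebook under root/books and return how many files were seen."""
    conn.executescript(SCHEMA)
    books_dir = root / "books"
    if not books_dir.is_dir():
        return 0

    indexed = 0
    # Layout assumed from the example: books/<author>/<title>/<file>
    for path in sorted(books_dir.glob("*/*/*")):
        if not path.is_file() or path.suffix.lower() not in EBOOK_SUFFIXES:
            continue  # skips covers and anything else that isn't an ebook

        title = path.parent.name
        author = path.parent.parent.name

        # Create the author unless it already exists.
        conn.execute("INSERT OR IGNORE INTO authors (name) VALUES (?)", (author,))
        author_id = conn.execute(
            "SELECT id FROM authors WHERE name = ?", (author,)
        ).fetchone()[0]

        # Refresh filepath if the book exists from a previous crawl, else create it.
        cur = conn.execute(
            "UPDATE books SET filepath = ? WHERE author_id = ? AND title = ?",
            (str(path), author_id, title),
        )
        if cur.rowcount == 0:
            conn.execute(
                "INSERT INTO books (author_id, title, filepath) VALUES (?, ?, ?)",
                (author_id, title, str(path)),
            )
        indexed += 1

    conn.commit()
    return indexed
```

Running it twice against the same folder should leave one author row and one book row, with filepath pointing at wherever the file was found on the latest crawl. If a book folder holds more than one ebook file, this sketch just keeps the last one it sees, which we may want to revisit.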
Since the folder we index is meant to be configurable, we should use this as an opportunity to read that value from an env var.
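Reading the folder from the environment could look something like the sketch below. The variable name `EBOOK_LIBRARY_ROOT` and the `library_root` helper are made up for illustration; we haven't picked a name yet. It fails loudly when the value is missing or doesn't point at a directory, which seems safer than silently indexing nothing.

```python
import os
from pathlib import Path

# Hypothetical name; the ticket doesn't specify what the env var is called.
ENV_VAR = "EBOOK_LIBRARY_ROOT"


def library_root(environ=None) -> Path:
    """Return the folder to index, read from the environment."""
    env = os.environ if environ is None else environ
    raw = env.get(ENV_VAR)
    if not raw:
        raise RuntimeError(f"{ENV_VAR} is not set")

    root = Path(raw).expanduser()
    if not root.is_dir():
        raise RuntimeError(f"{ENV_VAR} does not point at a directory: {root}")
    return root
```

Taking the environment as an optional argument keeps the helper easy to test without touching the real process environment.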