Closed leaderdog closed 6 years ago
Unfortunately, goodreads just lists titles; there is nothing useful in the info we get back. It tells you if there is an ebook (often wrong) and whether the media is book, paperback or hardback.
In the lazylibrarian config, filter tab, there is a Book Reject List where you can put words to reject. Books with these words in the title will get rejected, but currently only single words work, not phrases like "box set". If this was expanded to include phrases it might solve a lot of it?
These should already work: Prologue, Anthology, audiobook, collection
These need phrases: Short Stories, graphic novel, box set
Probably also need to add a switch so you can ignore parts and compilations, e.g. (The Stormlight Archive #2, Part 4 of 5), (Legion, #1-3), Oathbringer (2 of 6)
Might get to look at this tomorrow.
Hi Phil,
is the book reject list for downloading or displaying in the author page?
The reason I ask is, doesn't matter what you put in there if it's just for downloading, the majority of releases wouldn't be the small parts or the anthologies etc, they'd just be the novels and some times the novellas.
What I want is a clean-looking author page. When there are 68-117 books (like Sanderson's) that are short stories, box sets, two-for-one books and what have you, it just makes a massive list of nothing you need, and it's very time consuming to figure out what needs to be there and remove the rest. I just want it to be easy to monitor what I have/will want.
If the book reject list does that, it will help.
Sucks that goodreads doesn't properly categorize the titles. Kinda silly really. Does google books do that? Or any other library with an api that you know of?
Reject list should do both, outright reject on downloading, option to reject or mark-as-ignored on author page. Just added reject phrases (we could almost do it anyway) and just testing sets and part books.
Brandon seems a good author to test it on, lots of sets and parts, but unfortunately just dropping "no publication date" isn't quite enough. He has lots of books with no publication date, and several have publication dates well into the future eg Elantris #3, 2022 !!
"no isbn" cuts a lot of the junk too.
My reject words list is currently audiobook, mp3, box set, graphic novel, excerpt, chapters, sampler, boxed set, bundle, book series, graphicaudio
If testing is ok you should be able to just "refresh author" and apply the new filters to what's already there.
ok, so what did I do wrong?
1) I removed Brandon Sanderson as author (had already made so many changes)
2) removed the books from the harddrive (saved on my pc)
3) added all my filter names:
audiobook, mp3, box set, graphic novel, excerpt, chapters, sampler, boxed set, bundle book series, graphicaudio, anthology, prologue, short stories, part 1, part 2
4) added author Brandon Sanderson
5) looking at everything he has 116 files
It didn't filter anything. Should there be a space after the comma? Could that be messing it up?
Sanderson is an active author, he'll put excerpts of his work on there for people to read and what not, that's pretty cool that he's that involved, but I'm a let me just read the book kinda guy, as I tend to forget a lot haha.
And he has many books for his universe mapped out, and of course adds them to goodreads waaaay before they're out, as you pointed out. ;)
New version of lazylibrarian isn't out yet, still testing :-)
Spaces round commas are ok, a "phrase" can have spaces in it, but we strip leading/trailing ones. You maybe don't need part 1, part 2 in the filter, unless you want to ignore books like "The Bands of Mourning, Part 1" which might be part of a series rather than part of a book. Books like "Words of Radiance (Part 1 of 5)" get stripped out by the "part book" tickbox as they include two numbers
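As a rough illustration of the behaviour described above (comma-separated list, phrases may contain spaces, leading/trailing spaces stripped), here is a minimal sketch. The function names are made up for the example and this is not lazylibrarian's actual code; matching on whole words, case-insensitively, is also an assumption.

```python
import re


def parse_reject_list(raw):
    """Split a comma-separated reject list into lower-cased phrases.

    Spaces around commas are dropped; spaces inside a phrase are kept.
    """
    return [p.strip().lower() for p in raw.split(",") if p.strip()]


def rejected_by(title, phrases):
    """Return the first reject phrase found in the title, or None.

    Assumption: phrases match on word boundaries, so "part 1" does not
    match "Part 10".
    """
    lowered = title.lower()
    for phrase in phrases:
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            return phrase
    return None


phrases = parse_reject_list("audiobook, mp3 ,box set,  graphic novel ")
print(rejected_by("Mistborn Trilogy Box Set", phrases))
print(rejected_by("Warbreaker", phrases))
```

With a list like this, a missing comma (for example between "bundle" and "book series") silently turns two entries into one longer phrase, which is worth checking if a filter seems to do nothing.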
When the new version is out, in addition to the filter you need to tick the relevant boxes in config->importing. Seems to work well if you tick all of:
Ignore books with future publication date (refresh author after this date will change status from ignored to either wanted or skipped, depending on other config settings)
Ignore books with unknown publication date (incomplete info, often other bits are missing too; if goodreads improve their data, refreshing author will pick it up)
Ignore books with no isbn (ditto)
Ignore part books and sets (new, ignore if things in title like 1 of 3 or 1-3)
Add ignored books to database (so you can manually change status on ones you don't want to ignore)
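To make the tickboxes above concrete, here is a small sketch of how such checks could be combined. It is only an illustration: the option keys, the book dictionary fields and the regular expression are assumptions for the example, not the real lazylibrarian implementation.

```python
import re
from datetime import date

# Assumption: a "part book or set" title contains two numbers joined by
# "of" or a dash, e.g. "1 of 3" or "1-3".
PART_OR_SET = re.compile(r"\b\d+\s*(?:of|-)\s*\d+\b", re.IGNORECASE)


def is_part_or_set(title):
    """True if the title looks like a part book or a set."""
    return PART_OR_SET.search(title) is not None


def ignore_reason(book, today, opts):
    """Return a short reason string if the book should be ignored, else None.

    book: dict with hypothetical keys "title", "pubdate" (date or None), "isbn".
    opts: dict of booleans standing in for the config tickboxes.
    """
    pub = book.get("pubdate")
    if opts.get("no_pubdate") and pub is None:
        return "unknown publication date"
    if opts.get("future_pubdate") and pub is not None and pub > today:
        return "future publication date"
    if opts.get("no_isbn") and not book.get("isbn"):
        return "no isbn"
    if opts.get("part_or_set") and is_part_or_set(book["title"]):
        return "part book or set"
    return None


opts = {"no_pubdate": True, "future_pubdate": True,
        "no_isbn": True, "part_or_set": True}
book = {"title": "Words of Radiance (Part 1 of 5)",
        "pubdate": date(2015, 1, 1), "isbn": "x"}
print(ignore_reason(book, date(2018, 1, 1), opts))
```

Returning the reason rather than a plain True/False makes it easy to log or store why a book was ignored.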
Probably won't put the new version out until tomorrow, just thought of another improvement
well that makes me feel better. I was starting to question my intellect because it didn't seem like difficult instructions to follow. :)
I currently have "ignore part books and sets" ticked, and the filter list we discussed above, and Brandon Sanderson has gone from 118 to 75. Knocking off "future publication dates" brings it down a bit more, and "no publication date" loses another 30 or so but might be too strict? New version is out, see what you think.
oooooh, that's much much better!!
I ticked everything in the list except "Ignore books with future publication date" (not sure how LL refreshes or looks for new novels on an author's page), removed part 1, part 2 from my original list above, and ended up with 34 books.
Some I will still have to remove, but that's far better than the huge list of random things to sort through.
Nice job. :)
I'll go through and compare what I have and what the list gave and see how close they are or if anything is missing that shouldn't be.
Quick question, when it downloads how can I get it to be green "Have" instead of "open" under status?
I noticed that 2 books/novellas got put on the ignore list, not sure why:
The Bands of Mourning (mistborn book 6) (id# 18739426)
Skin Deep (Legion #2) (id# 20886354)
Otherwise I only had to remove 6 entries. So that's a major improvement!!
oh, one thing that has changed, I didn't alter my post-processing, but now the series name is showing up in brackets?
Mistborn - 06 - The Bands of Mourning (Mistborn, #6).epub
Legion - 02 - Skin Deep (Legion, #2).epub
I haven't set anything else to download yet, those two downloaded when I changed them to wanted when I moved them from the ignore list.
"Have" status can be set from the ebooks or individual author pages using the dropdown at the top of the page. "Open" status is a button set on libraryscan or after download that you can click on to open the book (download in a browser, view if your browser has a suitable plugin). The "Have" status is set manually and tells lazylibrarian not to look for the book as you've already got it, but lazylibrarian hasn't noticed.
I noticed that 2 books/novellas got put on the ignore list, not sure why:
The Bands of Mourning (mistborn book 6) (id# 18739426)
Skin Deep (Legion #2) (id# 20886354)
These don't look like novellas from the title, but if you check the log for the names you should see why they got rejected, lots of reject reasons. They are both marked "Skipped" here rather than "Ignored". Might look at whether we should keep the reject reason in the database for easier reference later, if you don't notice for a while and the log has rolled over you won't ever be able to tell unless you refresh author again.
Series name in brackets on the end might be your config name formatting settings, unless I've broken something inadvertently. Check your config eBook Filename Pattern: and the api call you used earlier to show the names
Hi Phil,
for the "Have" status, what I'm wanting is a toggle to set so it automatically sets it to Have. I don't use the browser to read books, I put them on my digital reader. So for me (I know I'm a pain) it would just be handier if it just goes straight to have after it's been post-processed.
It says no ISBN for The Bands of Mourning. weird since it's a full novel sold in stores. Same with Skin Deep.
No ISBN sure gets rid of a lot, but it does seem to miss a couple of real books too. Warbreaker got rejected because of that as well, and it's a full novel.
I didn't change anything in my post-processing, so not sure why, unless you see something that would account for it:
hmm, when I try this: http://172.16.1.89:5299
http://172.16.1.89:5299/api?apikey=1fd339bb18601db637e0b631f3163eb5=nameVars&id=18739426
it says incorrect API key. I generated a new one and tried again, same thing happened.
Thanks Phil!
for the "Have" status, what I'm wanting is a toggle to set so it automatically sets it to Have. I don't use the browser to read books, I put them on my digital reader. So for me (I know I'm a pain) it would just be handier if it just goes straight to have after it's been post-processed.
I might look at making it switchable
It says no ISBN for The Bands of Mourning. weird since it's a full novel sold in stores. Same with Skin Deep.
This means goodreads doesn't have an isbn for the book (it only has an amazon ASIN) and our lookup using the title with google isbn turned up nothing either - not sure why, will need to take a look...
I didn't change anything in my post-processing, so not sure why, unless you see something that would account for it:
Nothing in your config that would account for it, need to look into that too. Should have split the series info from the title on import...
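For illustration, splitting trailing series info such as "(Mistborn, #6)" from a goodreads-style title could look roughly like the sketch below. The regular expression and function name are assumptions made for this example, not the code lazylibrarian actually uses.

```python
import re

# Assumption: series info appears at the end of the title as
# "(Series Name, #N)" or "(Series Name #N)".
SERIES_SUFFIX = re.compile(
    r"^(?P<title>.*?)\s*\((?P<series>[^()]+?),?\s*#(?P<num>[\d.\-]+)\)\s*$"
)


def split_series(full_title):
    """Return (title, series, number); series and number are None if absent."""
    match = SERIES_SUFFIX.match(full_title)
    if not match:
        return full_title.strip(), None, None
    return match.group("title"), match.group("series"), match.group("num")


print(split_series("The Bands of Mourning (Mistborn, #6)"))
print(split_series("Warbreaker"))
```

If the split is skipped on import, the series text stays in the title and ends up in the filename, which matches the "Mistborn - 06 - The Bands of Mourning (Mistborn, #6).epub" result reported above.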
Well, we don't look for an isbn anywhere except goodreads because we told it not to :-) The "ignore books with no isbn" comes before we get to the part that tries google. Fixed now.
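The ordering problem described here can be sketched as follows: try the fallback lookup first, and only apply the "no isbn" rule once every source has been tried. The function names and the `fallback_lookup` callable are hypothetical stand-ins, not lazylibrarian's real functions.

```python
def resolve_isbn(book, fallback_lookup):
    """Return an ISBN from the book record, else from the fallback, else None.

    fallback_lookup is a hypothetical callable (title, author) -> isbn or None,
    standing in for a second source such as a google lookup by title.
    """
    isbn = book.get("isbn")
    if not isbn:
        isbn = fallback_lookup(book["title"], book["author"])
    return isbn or None


def should_ignore_for_isbn(book, fallback_lookup, ignore_no_isbn=True):
    """Apply the 'no isbn' rule only after all lookups have been attempted."""
    if not ignore_no_isbn:
        return False
    return resolve_isbn(book, fallback_lookup) is None
```

Checking the rule before the fallback runs would reject any book whose first source lacks an ISBN, which is the behaviour seen with Warbreaker and The Bands of Mourning.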
http://172.16.1.89:5299/api?apikey=1fd339bb18601db637e0b631f3163eb5=nameVars&id=18739426 it says incorrect API key. I generated a new one and tried again, same thing happened.
You have =nameVars at the end of the api key, should be &cmd=nameVars
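Building the query string programmatically avoids this kind of typo. A minimal sketch using only the Python standard library is below; the `apikey`, `cmd` and `id` parameters come from the URL in this thread, while the helper name, host and key value are placeholders.

```python
from urllib.parse import urlencode


def build_api_url(base, apikey, cmd, **params):
    """Build an api URL of the form <base>/api?apikey=...&cmd=...&key=value."""
    query = {"apikey": apikey, "cmd": cmd}
    query.update(params)
    return base.rstrip("/") + "/api?" + urlencode(query)


# Placeholder host and key; substitute your own values.
print(build_api_url("http://localhost:5299", "YOUR_API_KEY", "nameVars", id=18739426))
```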
Added an option to mark imports as "Have" instead of "Open", but it only works in new imports or libraryscan. To change your existing books just run libraryscan, or (faster) change the status using the dropdown on the ebook page: show all on page, filter on open, tick the "select all" box, change status to Have.
nice, that works well when downloading.
I tried doing the library scan as I have only 6 authors added at this point, but it didn't seem to do anything. Everything still has 'open' under status.
I ended the process and restarted it, and running libraryscan still didn't change the status from Open to Have.
Not sure why, but changing it with the other method is fine too, unless you want me to do some more testing as to why it didn't change the status?
Thanks for adding this! :)
Yes that's ok, late change to the code, we don't change existing status on a libraryscan as you might have changed it manually, just use the drop-down method.
ok sounds good.
Thanks for making the toggle option! :)
Hi Phil,
We briefly talked about this, but is there anything in the API that will help to narrow down, or filter out or remove unnecessary titles?
I added Brandon Sanderson (amazing author) and over 100 titles populate his author page. Almost everything seems to be working well; just this needs to be addressed somehow. I spent a couple of hours trying to remove short stories, audiobooks, books that are split into part 1 and part 2 (for whatever reason they do that) and a host of other unknown items.
Does goodreads just list titles or are there categories that we can use to filter/remove/not list at all since it just clutters up the place with well, garbage?
Things I'd filter out: Prologue, Anthology, part 1 of 2, part 2 of 2, audiobooks, Short Stories, graphic novels, box sets, collections
Essentially, I just want novels and novellas (and honestly I'm on the fence for these)
Is there anything that can be done? Possibly toggles in the config to keep these from being listed in the author page?
Or even tabs under the author name sorting novels from others, so I don't have to see or deal with the others except for the option to move items over to novels if LL messed up a title?
LL is to help automate, but spending 2 hours or more (as I'm still not sure what half the stuff in Sanderson's list is - 68 titles to figure out) is rather daunting.
Thanks. :)