Closed by JimmXinu 3 months ago
For which stories, and in which region, does this happen? I have yet to observe the changes. Without Chromium proxies, everything shows up as expected except the pages, which I have fixed on my branch. The only time I saw the adapter erroring out on `follows`, `numCollections`, `collections`, `numAwards`, and `awards` was when using a proxy that executed the code being scraped. So now, when using a proxy like nsapa, that metadata will not be collected. Is that OK?
Sorry, I meant to link the original report. I don't know the poster's region, but I'm trying from the Midwest USA.
After getting a 403 Forbidden directly, I tested with FlareSolverr v3.3.16 (the version I already had installed) and the story URL given.
But I see what you mean: it works for me using Browser Cache instead.
In an ideal world, site-specific metadata collection would work both with and without a proxy. I don't read that site, so you can decide whether it would be best to not have those metadata entries at all, or to not collect them when using a proxy.
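The second option (skip the affected entries only when a proxy is in use, and never let one failed entry abort the rest) could look roughly like the sketch below. This is an illustration under assumptions, not the ficbook adapter's actual code: `collect_optional_metadata`, `SCRIPT_DEPENDENT`, and the `extractors` mapping are hypothetical names, and only the entry names come from this thread.

```python
import logging

logger = logging.getLogger(__name__)

# Entry names taken from this thread; treating them as one group that a
# proxy cannot deliver reliably is an assumption made for illustration.
SCRIPT_DEPENDENT = ("follows", "numCollections", "collections",
                    "numAwards", "awards")


def collect_optional_metadata(extractors, using_proxy=False,
                              skip_with_proxy=SCRIPT_DEPENDENT):
    """Run each extractor and return whatever could be collected.

    `extractors` maps an entry name to a zero-argument callable that
    returns the value or raises if the page does not contain it.
    """
    metadata = {}
    for name, extract in extractors.items():
        if using_proxy and name in skip_with_proxy:
            # Known to be unreliable through a proxy, so do not try.
            continue
        try:
            metadata[name] = extract()
        except Exception as exc:
            # A site layout change should cost one entry, not the download.
            logger.warning("Skipping metadata entry %s: %s", name, exc)
    return metadata
```

With this shape, a site change that breaks one entry costs a warning instead of a failed download, and the proxy case can be narrowed later by shrinking `skip_with_proxy`.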
For now, to make it work, I would just push the changes as is; as far as I have tested, it works. The problem is that scraping `awards`, for example, when using a proxy would require a user or a script to click 'load more' as many times as necessary to get every award, and I have no idea how to do that. Next week I can try to make `follows`, `numCollections`, and `numAwards` work for proxies.
The metadata collection now works with proxies. Everything is collected except `awards` when using proxies. I have tested the adapter on a handful of stories and everything looks fine. I just do not know how to test whether it works in python2. After that it should be good to go.
I test python2 by using Portable Calibre v2.85.1 with the plugin version.
I'll wait until you tell me you're done with #1077 before I merge it.
@dbhmw,
Looks like ficbook.net has changed. Collection of `follows`, `collections`, and `numAwards`, at least, is now erroring out. I stopped looking at that point. Do you have the time and interest to look into fixing it?