Open fabianski7 opened 4 years ago
By books, do you mean non-scientific references?
As of now, the script targets http://gen.lib.rus.ec/scimag/ (scientific references only) which may explain why you're not able to recover some references with it.
Yes, books from http://gen.lib.rus.ec.
Can you give me some examples of references you can't retrieve through http://gen.lib.rus.ec/scimag/? I'll take a look at how I can implement it.
The download links follow a specific pattern:
http://ip/main/?/md5hash/
See these examples:
http://93.174.95.29/main/870000/2e0f494c2a31ba864891ff21e2625b9a/
http://93.174.95.29/main/870000/ce62a967106ba1695ff5065f2c8735a9/
http://93.174.95.29/main/870000/53e6a9293463cbebac96a8c34fe28994/
http://93.174.95.29/main/870000/50528fb9f8ad4def9b1efc53ba9f7df5/
At the end of the URL, any name can be added; it becomes the name of the downloaded file.
http://93.174.95.29/main/870000/50528fb9f8ad4def9b1efc53ba9f7df5/somebook.mobi
I don't know how the number I represented with the "?" is obtained. I looked through the database dump, but it wasn't present in any table. I only managed to import the compact version of the dump; the full version is over 12 GB and my computer couldn't handle it lol
There are other Python scripts here on GitHub that also download books from http://gen.lib.rus.ec, and they could perhaps serve as a reference, such as https://github.com/adolfosilva/libgen.py and the API at https://github.com/mmarquezs/libgen-python-api. There are several others like these.
Thank you for your attention.
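For reference, the pattern above can be sketched in Python as follows. This is only an illustration, not part of the script: the function name `build_download_url` is made up, and since it's unclear how the "?" number is derived, it is simply passed in as a `bucket` argument by the caller.

```python
import re
from urllib.parse import quote

# An MD5 hash is 32 hexadecimal characters.
MD5_RE = re.compile(r"[0-9a-fA-F]{32}")


def build_download_url(host, bucket, md5, filename=""):
    """Build a link of the form http://host/main/bucket/md5/filename.

    `bucket` is the number shown as "?" in the pattern; how it is
    derived is not known here, so the caller has to supply it.
    """
    if not MD5_RE.fullmatch(md5):
        raise ValueError(f"not an MD5 hash: {md5!r}")
    url = f"http://{host}/main/{bucket}/{md5.lower()}/"
    # The trailing name is optional and only sets the saved file name.
    return url + quote(filename)


print(build_download_url(
    "93.174.95.29", 870000,
    "50528fb9f8ad4def9b1efc53ba9f7df5", "somebook.mobi",
))
```

Taking the bucket as a parameter keeps the sketch honest about the unknown part: once the rule behind that number is found, only the caller needs to change.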
That shouldn't be difficult to implement.
I just need to figure out how the requests are made on http://gen.lib.rus.ec/ for the books you're talking about.
For that purpose, I need you to tell me how you're searching for these books on http://gen.lib.rus.ec/ (by ISBN, for instance?)
I use a list of MD5 hashes that I exported from the database dump.
But the site also supports searching by other terms, like the ISBN you mentioned. I think the hash search is more appropriate, because every book there is identified by its hash.
I'll probably figure out an ISBN search method, since when looking for a book to download, we retrieve its ISBN and not the hash linked to it on http://gen.lib.rus.ec/
What do you think about it?
It might be a good option.
At the moment I don't have an ISBN list available, but I remember seeing some books that had no ISBN at all and some that had more than one.
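To illustrate why that matters for an ISBN search, here is a rough Python sketch. The record layout (an MD5 hash paired with a list of ISBNs) and the sample values are made up for the example and are not taken from the database dump. It indexes books by ISBN and separately collects the ones that can only be reached by hash.

```python
from collections import defaultdict


def normalize_isbn(isbn):
    """Drop hyphens and spaces so differently formatted ISBNs compare equal."""
    return isbn.replace("-", "").replace(" ", "").upper()


def index_by_isbn(records):
    """Return (isbn -> set of md5 hashes, list of md5 hashes without any ISBN).

    `records` is an iterable of (md5, list_of_isbns) pairs; this layout
    is an assumption made for the sketch.
    """
    index = defaultdict(set)
    no_isbn = []
    for md5, isbns in records:
        if not isbns:
            no_isbn.append(md5)  # reachable by hash only
        for isbn in isbns:
            index[normalize_isbn(isbn)].add(md5)
    return index, no_isbn


# Placeholder records, not real entries.
records = [
    ("a" * 32, ["978-0-13-110362-7", "0131103628"]),  # more than one ISBN
    ("b" * 32, []),                                   # no ISBN
]
index, no_isbn = index_by_isbn(records)
print(dict(index))
print(no_isbn)
```

An ISBN lookup built this way would miss every book in `no_isbn`, so a hash search would still be needed as a fallback.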
Would it be possible to add support for http://gen.lib.rus.ec/? Many books are not available at libgen.lc mirror