parisbastien / GATHERLSEVIER

GATHERLSEVIER lets you instantly download the articles you want from libgen and sci-hub, with no captcha or web browsing.

LibGen #1

Open fabianski7 opened 4 years ago

fabianski7 commented 4 years ago

Would it be possible to add support for http://gen.lib.rus.ec/? Many books are not available on the libgen.lc mirror.

parisbastien commented 4 years ago

By books, do you mean non-scientific references?

As of now, the script targets http://gen.lib.rus.ec/scimag/ (scientific references only), which may explain why you're not able to retrieve some references with it.

fabianski7 commented 4 years ago

Yes, books from http://gen.lib.rus.ec.

parisbastien commented 4 years ago

Can you give me some examples of references you can't retrieve through http://gen.lib.rus.ec/scimag/? I'll take a look at how I can implement it.

fabianski7 commented 4 years ago

The download links follow a specific pattern. Here is the pattern, followed by some examples:

ip/main/?/hashmd5

http://93.174.95.29/main/870000/2e0f494c2a31ba864891ff21e2625b9a/
http://93.174.95.29/main/870000/ce62a967106ba1695ff5065f2c8735a9/
http://93.174.95.29/main/870000/53e6a9293463cbebac96a8c34fe28994/
http://93.174.95.29/main/870000/50528fb9f8ad4def9b1efc53ba9f7df5/

Any name can be appended to the end of the URL; it becomes the name of the downloaded file.

http://93.174.95.29/main/870000/50528fb9f8ad4def9b1efc53ba9f7df5/somebook.mobi
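
The pattern above could be sketched in Python roughly as follows. This is only an illustration, not part of GATHERLSEVIER: the function name `build_download_url` and its parameters are hypothetical, and the middle path segment (the "?" in the pattern) is passed in as an opaque value because how it is derived is not established in this thread.

```python
from urllib.parse import quote

HEX_DIGITS = set("0123456789abcdef")


def build_download_url(host, bucket, md5, filename=""):
    """Assemble a URL of the form host/main/bucket/md5/[filename].

    `bucket` is the unexplained middle segment ("?" in the pattern);
    it is taken as given here, not computed.
    """
    md5 = md5.strip().lower()
    # An MD5 digest is always 32 hexadecimal characters.
    if len(md5) != 32 or not set(md5) <= HEX_DIGITS:
        raise ValueError(f"not an MD5 hash: {md5!r}")
    url = f"http://{host}/main/{bucket}/{md5}/"
    if filename:
        # Percent-encode the optional file name so spaces etc. stay valid.
        url += quote(filename)
    return url


if __name__ == "__main__":
    print(build_download_url(
        "93.174.95.29", "870000",
        "50528fb9f8ad4def9b1efc53ba9f7df5", "somebook.mobi",
    ))
```

The file name is optional, matching the examples above, where the URL works with or without a trailing name.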

I don't know how the number I represented with the "?" is obtained. I looked through the database dump, but it wasn't present in any table. I only managed to import the compact version of the dump; the full version is over 12 GB and my computer couldn't handle it lol

There are other Python scripts here on GitHub that also download books from http://gen.lib.rus.ec. Perhaps they can serve as a reference, for example https://github.com/adolfosilva/libgen.py and the API at https://github.com/mmarquezs/libgen-python-api. There are several others like these.

Thank you for your attention.

parisbastien commented 4 years ago

That shouldn't be difficult to implement.

I just need to figure out how the requests are made on http://gen.lib.rus.ec/ for the books you're talking about.

To do that, I need you to tell me how you search for these books on http://gen.lib.rus.ec/ (by ISBN, for instance?).

fabianski7 commented 4 years ago

I use a list of MD5 hashes that I exported from the database dump.

But the site also supports searching by other terms, like the ISBN you mentioned. I think searching by hash is more appropriate, because every book there is identified by its hash.
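
A list of hashes exported from a dump may contain blank lines, mixed case, or duplicates, so it could be cleaned up before use with something like the sketch below. The helper name `clean_md5_list` is hypothetical and the one-hash-per-line input format is an assumption; the only fact relied on is that an MD5 digest is 32 hexadecimal characters.

```python
import re

# An MD5 digest written in hex is exactly 32 characters from 0-9 and a-f.
MD5_RE = re.compile(r"[0-9a-f]{32}")


def clean_md5_list(lines):
    """Return the valid MD5 hashes from `lines`, lowercased and de-duplicated.

    Assumes one candidate hash per line; the original order is preserved.
    """
    seen = set()
    result = []
    for line in lines:
        candidate = line.strip().lower()
        if MD5_RE.fullmatch(candidate) and candidate not in seen:
            seen.add(candidate)
            result.append(candidate)
    return result


if __name__ == "__main__":
    sample = [
        "2E0F494C2A31BA864891FF21E2625B9A\n",
        "not-a-hash\n",
        "2e0f494c2a31ba864891ff21e2625b9a\n",
        "ce62a967106ba1695ff5065f2c8735a9\n",
    ]
    for md5 in clean_md5_list(sample):
        print(md5)
```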

parisbastien commented 4 years ago

I'll probably work out an ISBN search method, since when we want to download a book we usually have its ISBN rather than the hash linked to it on http://gen.lib.rus.ec/.

What do you think about it?

fabianski7 commented 4 years ago

It might be a good option.

At the moment I don't have an ISBN list available, but I remember seeing some books that had no ISBN and some that had more than one identifier.
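
An ISBN-based lookup would therefore need to cope with records that have zero, one, or several ISBNs. A minimal sketch of that handling is below. The function names are hypothetical, and the comma-separated field format is an assumption not confirmed in this thread; only the ISBN-10 and ISBN-13 check-digit rules are standard. An empty result would signal that the caller has to fall back to another key, such as the MD5 hash.

```python
DIGITS = set("0123456789")


def is_valid_isbn(code):
    """Check the ISBN-10 or ISBN-13 check digit, ignoring hyphens and spaces."""
    code = code.replace("-", "").replace(" ", "").upper()
    if len(code) == 10:
        body, last = code[:9], code[9]
        if not set(body) <= DIGITS or (last not in DIGITS and last != "X"):
            return False
        values = [int(c) for c in body] + [10 if last == "X" else int(last)]
        # ISBN-10: weights 10 down to 1, total must be divisible by 11.
        return sum((10 - i) * v for i, v in enumerate(values)) % 11 == 0
    if len(code) == 13 and set(code) <= DIGITS:
        # ISBN-13: alternating weights 1 and 3, total must be divisible by 10.
        return sum((3 if i % 2 else 1) * int(c) for i, c in enumerate(code)) % 10 == 0
    return False


def extract_isbns(field):
    """Split an identifier field (assumed comma-separated) into valid ISBNs.

    Returns an empty list when the record has no usable ISBN.
    """
    candidates = [part.strip() for part in field.split(",")]
    return [c for c in candidates if c and is_valid_isbn(c)]


if __name__ == "__main__":
    print(extract_isbns("0306406152, 978-0-306-40615-7"))
    print(extract_isbns(""))
```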