nilfoer / mangadb

Local database including a web front-end for managing mangas/doujinshi.
MIT License

Extractor Requests. #5

Closed by ruchisa-dev 3 years ago

ruchisa-dev commented 3 years ago

Hi, I know I'm being a lot of trouble, sorry!

nilfoer commented 3 years ago

The extractors are done, but toonily.com uses Cloudflare for DDoS protection (e.g. the 'Checking your browser...' screen). When the site is in "I'm Under Attack" mode (not all the time, but currently most of the time), the extractor has to get past that protection. Currently there's no easy way to do that automatically. There are semi-automatic options, all of which involve using your browser to get the cookies that show Cloudflare you're not a bot.

  1. The semi-automatic option I have in mind is that you'd have to get the cookies and User-Agent of your browser manually using the browser's developer tools (open with F12) and then pass them to MangaDB using the webGUI or the command-line. But I don't know if the average user would be able to do that.
  2. Or I could make the browser extension work again and use that to pass either the cookies or the whole page content to MangaDB. This is obviously easier for the user, but takes way longer to implement. There's also tons of added complexity in supporting the combinations of GNU/Linux and Windows with either the standalone or the Python script version. Then there's the question of whether to get the addon onto the official addon page, since otherwise users have to load the extension into their browser manually (and re-do it after every restart).
  3. Maybe make a simpler extension that reads the cookies and User-Agent and shows it to the user, so they can copy and paste it to MangaDB?

Which option would you prefer?
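
For illustration, here is a minimal sketch of what options 1 and 3 come down to on the receiving side: take the cookie string and User-Agent copied from the browser and attach both to the outgoing request. This is not MangaDB's actual code. The function names `parse_cookie_header` and `build_request` are hypothetical, the cookie values are placeholders, and only the Python standard library is used. Cloudflare's clearance cookie (`cf_clearance`) is generally only accepted together with the same User-Agent that obtained it, which is why both values have to be copied.

```python
from urllib.request import Request


def parse_cookie_header(raw: str) -> dict[str, str]:
    """Split a 'name=value; name2=value2' string, as copied from the
    browser's developer tools, into a dict. (Hypothetical helper.)"""
    cookies = {}
    for part in raw.split(";"):
        # Only split on the first '=' so values containing '=' stay intact.
        name, sep, value = part.strip().partition("=")
        if sep and name:
            cookies[name] = value
    return cookies


def build_request(url: str, cookies: dict[str, str], user_agent: str) -> Request:
    """Build a request that carries the browser's cookies and User-Agent.
    (Hypothetical helper; nothing is sent until the request is opened.)"""
    cookie_header = "; ".join(f"{name}={value}" for name, value in cookies.items())
    return Request(url, headers={"User-Agent": user_agent, "Cookie": cookie_header})


if __name__ == "__main__":
    # Placeholder values; real ones come from the browser session.
    copied_cookies = "cf_clearance=abc123; session=xyz"
    copied_user_agent = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"

    request = build_request(
        "https://toonily.com/",
        parse_cookie_header(copied_cookies),
        copied_user_agent,
    )
    print(request.header_items())
```

The request would then be sent with `urllib.request.urlopen(request)`. Whether Cloudflare accepts it still depends on the cookie being fresh and the request coming from the same machine as the browser.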

ruchisa-dev commented 3 years ago

I mean, I don't want you to waste too much of your time on my requests, so I think the 3rd option would be best. Also, thanks for everything. I won't ask for anything for a while :D

ghost commented 3 years ago

I wish I had seen this earlier. I would recommend https://mangasee123.com/ or https://manga4life.com/ over manganelo, since manganelo is a low-quality site because it is an aggregator. I would also prefer the 3rd option, since it is the easier one for everyone (for you and for users).

nilfoer commented 3 years ago

[...] I would recommend https://mangasee123.com/ or https://manga4life.com/ [...]

They look almost the same, and when I googled the difference between them, people said that manga4life is basically the beta version of mangasee123. So I guess supporting just mangasee123 is fine. Or do they differ enough in content to make it worth supporting both?

ghost commented 3 years ago

As far as I know, mangasee has a little more manga than mangalife. They are almost the same.

ruchisa-dev commented 3 years ago

It seems both extractors work, thanks. I just wonder, must I use "--cookies" when collecting books from toonily in CLI mode? Also, should I close the issue now? My extractor requests are done, and thanks for that, but it seems @toprak wants mangasee.

nilfoer commented 3 years ago

Yeah, you need to use --cookies if you're using the CLI. Not necessarily while adding/collecting the links, but you'll need the cookies (assuming toonily.com is in "I'm Under Attack" mode) when you use the import command.

You can leave it open until I finish the mangasee123 extractor.

nilfoer commented 3 years ago

I added the extractor for mangasee123 to the new release. The site has a very weird structure though, so the extractor is not as robust as I'd like.

ruchisa-dev commented 3 years ago

I couldn't find any spare time to look into this until now. I finally did, and it works hella good, thank you!