TheOnlyWayUp / WattpadDownloader

Wattpad Books to EPUB Files. 🪙Download Paid Stories! Metadata and Cover Support 🏷️, Lightweight Frontend 🐇, Dockerized 🐳, Rapid Generation ⚡, API Support 🌐
https://wpd.rambhat.la

Exception thrown when executing multiple parallel requests #2

Closed Edgeburn closed 3 months ago

Edgeburn commented 5 months ago

I am running an instance of this application as part of my own project, which sends multiple requests to it in parallel. Although the application works perfectly when responding to one request at a time, it fails when hit with several at once and writes this to the logs:

INFO:     172.19.0.4:59578 - "GET /download/[REDACTED] HTTP/1.1" 500 Internal Server Error
wattpad-downloader         | ERROR:    Exception in ASGI application
wattpad-downloader         | Traceback (most recent call last):
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 408, in run_asgi
wattpad-downloader         |     result = await app(  # type: ignore[func-returns-value]
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
wattpad-downloader         |     return await self.app(scope, receive, send)
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
wattpad-downloader         |     await super().__call__(scope, receive, send)
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/starlette/applications.py", line 116, in __call__
wattpad-downloader         |     await self.middleware_stack(scope, receive, send)
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/errors.py", line 186, in __call__
wattpad-downloader         |     raise exc
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/errors.py", line 164, in __call__
wattpad-downloader         |     await self.app(scope, receive, _send)
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
wattpad-downloader         |     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/starlette/_exception_handler.py", line 55, in wrapped_app
wattpad-downloader         |     raise exc
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/starlette/_exception_handler.py", line 44, in wrapped_app
wattpad-downloader         |     await app(scope, receive, sender)
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 746, in __call__
wattpad-downloader         |     await route.handle(scope, receive, send)
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 288, in handle
wattpad-downloader         |     await self.app(scope, receive, send)
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 75, in app
wattpad-downloader         |     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/starlette/_exception_handler.py", line 55, in wrapped_app
wattpad-downloader         |     raise exc
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/starlette/_exception_handler.py", line 44, in wrapped_app
wattpad-downloader         |     await app(scope, receive, sender)
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 70, in app
wattpad-downloader         |     response = await func(request)
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 299, in app
wattpad-downloader         |     raise e
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 294, in app
wattpad-downloader         |     raw_response = await run_endpoint_function(
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
wattpad-downloader         |     return await dependant.call(**values)
wattpad-downloader         |   File "/app/main.py", line 30, in download_book
wattpad-downloader         |     async for title in add_chapters(book, data):
wattpad-downloader         |   File "/app/create_book.py", line 131, in add_chapters
wattpad-downloader         |     content = await fetch_part_content(part["id"])
wattpad-downloader         |   File "/app/create_book.py", line 42, in fetch_part_content
wattpad-downloader         |     async with session.get(
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/aiohttp/client.py", line 1187, in __aenter__
wattpad-downloader         |     self._resp = await self._coro
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/aiohttp_client_cache/session.py", line 51, in _request
wattpad-downloader         |     response, actions = await self.cache.request(
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/aiohttp_client_cache/backends/base.py", line 139, in request
wattpad-downloader         |     response = None if actions.skip_read else await self.get_response(actions.key)
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/aiohttp_client_cache/backends/base.py", line 147, in get_response
wattpad-downloader         |     response = await self.responses.read(key) or await self._get_redirect_response(str(key))
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/aiohttp_client_cache/backends/base.py", line 169, in _get_redirect_response
wattpad-downloader         |     redirect_key = await self.redirects.read(key)
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/aiohttp_client_cache/backends/sqlite.py", line 180, in read
wattpad-downloader         |     row = await cursor.fetchone()
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/aiosqlite/cursor.py", line 65, in fetchone
wattpad-downloader         |     return await self._execute(self._cursor.fetchone)
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/aiosqlite/cursor.py", line 40, in _execute
wattpad-downloader         |     return await self._conn._execute(fn, *args, **kwargs)
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/aiosqlite/core.py", line 133, in _execute
wattpad-downloader         |     return await future
wattpad-downloader         |   File "/usr/local/lib/python3.10/site-packages/aiosqlite/core.py", line 106, in run
wattpad-downloader         |     result = function()
wattpad-downloader         | sqlite3.ProgrammingError: Cannot operate on a closed database.

TheOnlyWayUp commented 4 months ago

Hi @Edgeburn, this looks like an issue with aiohttp_client_cache. You can replace CachedSession(...) with aiohttp.ClientSession(headers=headers) at the following lines:

  1. https://github.com/TheOnlyWayUp/WattpadDownloader/blob/3f6eb6ed7c1bdecf4ddab95671380c066b709958/src/api/src/create_book.py#L19
  2. https://github.com/TheOnlyWayUp/WattpadDownloader/blob/3f6eb6ed7c1bdecf4ddab95671380c066b709958/src/api/src/create_book.py#L40
  3. https://github.com/TheOnlyWayUp/WattpadDownloader/blob/3f6eb6ed7c1bdecf4ddab95671380c066b709958/src/api/src/create_book.py#L61

This is the quick-and-dirty fix, and it removes caching. Discord is better for support. Out of curiosity, what are you working on?
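For anyone reading along, here is a minimal sketch of one way the error in the traceback can arise. It is an assumption, not a confirmed diagnosis of aiohttp_client_cache: two concurrent requests share one SQLite connection, and whichever finishes first closes it. It uses only the standard-library sqlite3 and asyncio modules, and the table and function names are made up for illustration.

```python
import asyncio
import sqlite3


async def request(conn, key, delay, close_when_done):
    """Simulate one request that reads a cached row from a shared connection."""
    try:
        await asyncio.sleep(delay)  # stands in for network latency
        cursor = conn.execute("SELECT value FROM cache WHERE key = ?", (key,))
        return cursor.fetchone()
    finally:
        # Assumed failure mode: the first request to finish tears down
        # the connection that the other request still needs.
        if close_when_done:
            conn.close()


async def main():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cache (key TEXT PRIMARY KEY, value TEXT)")
    conn.execute("INSERT INTO cache VALUES ('part-1', 'cached body')")
    # return_exceptions=True collects the failure instead of raising it.
    return await asyncio.gather(
        request(conn, "part-1", 0.0, close_when_done=True),
        request(conn, "part-1", 0.05, close_when_done=False),
        return_exceptions=True,
    )


if __name__ == "__main__":
    for result in asyncio.run(main()):
        print(repr(result))
```

The first request succeeds and the second fails with sqlite3.ProgrammingError. A plain aiohttp.ClientSession has no SQLite backend at all, which is why the swap sidesteps the error at the cost of caching.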

Edgeburn commented 4 months ago

> @Edgeburn Hi, this looks like an issue with aiohttp_client_cache. You can switch CachedSession(...) in the following lines to aiohttp.ClientSession(headers=headers)
>
> 1. https://github.com/TheOnlyWayUp/WattpadDownloader/blob/3f6eb6ed7c1bdecf4ddab95671380c066b709958/src/api/src/create_book.py#L19
> 2. https://github.com/TheOnlyWayUp/WattpadDownloader/blob/3f6eb6ed7c1bdecf4ddab95671380c066b709958/src/api/src/create_book.py#L40
> 3. https://github.com/TheOnlyWayUp/WattpadDownloader/blob/3f6eb6ed7c1bdecf4ddab95671380c066b709958/src/api/src/create_book.py#L61
>
> This is the quick and dirty fix, and removes caching. Discord is better for support.

If you don't end up pushing these changes, I'll definitely go ahead and fork to make them. Much appreciated!

> Out of curiosity, what are you working on?

I'm building a book library + archiver for my girlfriend's enormous Wattpad book collection, and I'm using a few instances of your project as an API to download copies of all 2k+ books. It's worked fantastically aside from this issue, and I appreciate your work!

TheOnlyWayUp commented 4 months ago

Hey @Edgeburn, that sounds sickkk. Are you using Calibre to store the library?

On the topic of caching, I likely won't be removing it on the master branch.

Caching is especially useful during ratelimits: if we're downloading a 200-part book and get ratelimited on the 50th part, the user can retry and continue from the 51st (the first 50 will hit the client cache, valid for 12 hours).
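A rough sketch of that resume behaviour, using a tiny in-memory TTL cache as a stand-in for the real client cache. Everything here (TTLCache, FlakyUpstream, download_book, RateLimited) is hypothetical and only illustrates the idea. The fake upstream fails on its 51st call, so exactly 50 parts are cached when the first attempt aborts.

```python
import time

CACHE_TTL_SECONDS = 12 * 60 * 60  # the 12-hour validity mentioned above


class RateLimited(Exception):
    """Raised by the fake upstream to simulate a ratelimit response."""


class TTLCache:
    """Minimal in-memory cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (self.clock() + self.ttl, value)


class FlakyUpstream:
    """Fake part fetcher that raises RateLimited on one specific call."""

    def __init__(self, fail_on_call):
        self.calls = 0
        self.fail_on_call = fail_on_call

    def fetch(self, part_id):
        self.calls += 1
        if self.calls == self.fail_on_call:
            raise RateLimited(part_id)
        return f"content of part {part_id}"


def download_book(part_ids, fetch, cache):
    """Fetch every part, going upstream only on a cache miss."""
    contents = []
    for part_id in part_ids:
        content = cache.get(part_id)
        if content is None:
            content = fetch(part_id)  # may raise RateLimited
            cache.set(part_id, content)
        contents.append(content)
    return contents


if __name__ == "__main__":
    cache = TTLCache(CACHE_TTL_SECONDS)
    upstream = FlakyUpstream(fail_on_call=51)
    parts = list(range(1, 201))
    try:
        download_book(parts, upstream.fetch, cache)
    except RateLimited:
        print("ratelimited after", upstream.calls, "upstream calls")
    book = download_book(parts, upstream.fetch, cache)  # the retry
    print(len(book), "parts,", upstream.calls, "upstream calls in total")
```

The retry makes only 150 new upstream calls, because the first 50 parts are served from the cache.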

Tbf, I didn't know aiohttp_client_cache didn't hold up to rapid requests. I'll have to try it out later (I haven't seen this issue on my instance).

If you really wanna speed things up, you can use asyncio.gather on this function:

https://github.com/TheOnlyWayUp/WattpadDownloader/blob/master/src/api/src/main.py#L20C1-L51C6

You'd first have to change the return value and remove the router decorator. You'd also have to find a way to deal with ratelimits; the backoff library plus semaphores or chunking should help.
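Here's a minimal sketch of the gather + semaphore idea. To keep it dependency-free it uses a small hand-rolled exponential backoff rather than the backoff library itself. download_one stands in for the reworked download function, and all names (RateLimited, with_backoff, download_many) are hypothetical.

```python
import asyncio
import random


class RateLimited(Exception):
    """Hypothetical error raised when the upstream ratelimits us."""


async def with_backoff(make_call, max_tries=5, base_delay=0.01):
    """Retry make_call() with exponential backoff plus a little jitter."""
    for attempt in range(max_tries):
        try:
            return await make_call()
        except RateLimited:
            if attempt == max_tries - 1:
                raise  # out of retries, let the caller see the error
            delay = base_delay * 2**attempt + random.uniform(0, base_delay)
            await asyncio.sleep(delay)


async def download_many(book_ids, download_one, max_concurrent=4):
    """Download all books concurrently, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(book_id):
        async with semaphore:  # the slot is held across retries too
            return await with_backoff(lambda: download_one(book_id))

    # gather returns results in the same order as book_ids.
    return await asyncio.gather(*(guarded(book_id) for book_id in book_ids))


async def demo():
    """Run against a fake downloader that ratelimits every first attempt."""
    attempts = {}
    in_flight = 0
    peak = 0

    async def fake_download(book_id):
        nonlocal in_flight, peak
        attempts[book_id] = attempts.get(book_id, 0) + 1
        if attempts[book_id] == 1:
            raise RateLimited(book_id)
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.001)  # pretend to do network work
        in_flight -= 1
        return f"book-{book_id}.epub"

    results = await download_many(range(10), fake_download, max_concurrent=3)
    return results, peak


if __name__ == "__main__":
    results, peak = asyncio.run(demo())
    print(results)
    print("peak concurrency:", peak)
```

Holding the semaphore while backing off is a deliberate choice in this sketch: it keeps a ratelimited worker from freeing its slot for yet another request that would likely be ratelimited as well.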

There's no obligation to do any of this haha, I'm hoping our thread will be useful to others trying to download a lot of books.

I'd love to hear more about your project, how can I get in touch?

Edgeburn commented 4 months ago

Although I've used Calibre to convert some Amazon Kindle books to EPUB, my project is entirely custom-built. It stores everything in a MariaDB database, including the EPUBs encoded in base64.
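For readers curious what the base64 step involves, here's a small illustrative sketch using Python's standard library (the project itself is written in Java, and these function names are made up). It also shows the size cost: base64 output is 4 characters for every 3 input bytes, rounded up.

```python
import base64


def encode_epub(epub_bytes: bytes) -> str:
    """Encode raw EPUB bytes as ASCII text suitable for a text column."""
    return base64.b64encode(epub_bytes).decode("ascii")


def decode_epub(encoded: str) -> bytes:
    """Recover the original EPUB bytes; reject non-base64 input."""
    return base64.b64decode(encoded, validate=True)


def encoded_length(raw_length: int) -> int:
    """Length of the padded base64 text for raw_length input bytes."""
    return 4 * ((raw_length + 2) // 3)


if __name__ == "__main__":
    payload = b"PK\x03\x04" + bytes(range(256)) * 4  # stand-in for an EPUB
    text = encode_epub(payload)
    print(len(payload), "bytes ->", len(text), "characters")
    print("round trip ok:", decode_epub(text) == payload)
```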

The rate limiting on Wattpad's end definitely was a bit of an issue, and my hacky workaround was basically just to deploy 4 instances of your program across 4 separate servers and have mine simply cycle through them all, which worked well enough.
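A rough Python sketch of that cycling approach, for illustration only (the actual project is in Java). The instance URLs are placeholders, and the /download/<id> path is taken from the log line at the top of this issue.

```python
from itertools import cycle

# Hypothetical base URLs of four separately hosted downloader instances.
INSTANCES = [
    "http://wpd-1.example:8000",
    "http://wpd-2.example:8000",
    "http://wpd-3.example:8000",
    "http://wpd-4.example:8000",
]


def assign_instances(book_ids, instances):
    """Pair each book ID with a download URL, cycling through instances."""
    pool = cycle(instances)
    return [(book_id, f"{next(pool)}/download/{book_id}") for book_id in book_ids]


if __name__ == "__main__":
    for book_id, url in assign_instances(["a", "b", "c", "d", "e"], INSTANCES):
        print(book_id, "->", url)
```

Each instance then sees roughly a quarter of the requests, which spreads the load across four servers and so across four sets of ratelimits.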

I appreciate your interest in my project! It's written in Java using the Spark framework, with basic HTML, CSS, and JavaScript webpages for the frontend. (Web frontend is not my strong suit at all, lol.)

The project is still in very early stages and is not yet public on GitHub. However, I intend to create a publicly usable version. The master branch was forked for a more basic version to get her book collection archived as quickly as possible.

If you're interested in it, please feel free to email me at edgeburn@edgeburnmedia.com or message me on Discord at @edgeburn02.