Gerapy / GerapyPlaywright

Downloader Middleware to support Playwright in Scrapy & Gerapy
106 stars 24 forks source link

Method is not Json serializable for actions #15

Open BenzTivianne opened 1 year ago

BenzTivianne commented 1 year ago

Hello,

I am running into an issue with 'yield' for gerapy-playwright when I need to access a login page. I try to run: yield PlaywrightRequest(login_page_url, self.parse_login, actions = self.login_action) in order to first use playwright to login and then access data that can only be accessed when logging in with self.parse_login. I am getting a: builtins.TypeError: is not JSON serializable.

I am using scrapy cluster along with gerapy-playwright in order to run a scheduler for all the spiders that I have: https://github.com/istresearch/scrapy-cluster

It seems that the action is saved in the meta data as a method and cannot be passed to the scheduler. Is it possible to type cast the action as a string and then when the action is called later, to do a method call on the string? If I understand correctly, the action is produced on line 339 of the downloadermiddlewares.py inside of gerepy-playwright. Would it be possible to evaluate the string as a method so that the scrapy-cluster scheduler can pass the string but gerapy-playwright still calls the self.login_action method prior to the self.parse_login?