Closed by gas1121 7 years ago
Can you elaborate on your use case for this? There was originally a reason we used a custom set of request_to_dict() methods.
Comparing them with the changes above, I don't see anything out of the ordinary at this point. I would like to add a new unit test showing that priority crawl submission is still respected, and may need to do some minor testing to make sure everything still works correctly.
I made this change because when I use scrapy.FormRequest.from_response to generate a POST request, it sometimes carries parameters in the request's body, which are lost when the request is deserialized from Redis.
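To make the failure mode concrete, here is a minimal sketch of the round trip. The `FakeRequest` class and the `to_dict`/`from_dict` helpers are stand-ins written for illustration (so the example runs without Scrapy installed); they only mimic the shape of Scrapy's `request_to_dict`/`request_from_dict`, the key point being that the `body` field is carried through, so POST parameters survive the Redis round trip:

```python
import json

class FakeRequest:
    """Illustrative stand-in for scrapy.Request (not the real class)."""
    def __init__(self, url, method="GET", body=b"", meta=None, priority=0):
        self.url = url
        self.method = method
        self.body = body
        self.meta = meta or {}
        self.priority = priority

def to_dict(req):
    # Mimics the shape of scrapy's request_to_dict: the body is included,
    # so parameters built by FormRequest.from_response are not dropped.
    return {
        "url": req.url,
        "method": req.method,
        "body": req.body.decode("latin1"),
        "meta": req.meta,
        "priority": req.priority,
    }

def from_dict(d):
    # Mimics request_from_dict: rebuilds the request, body included.
    return FakeRequest(
        url=d["url"],
        method=d["method"],
        body=d["body"].encode("latin1"),
        meta=d["meta"],
        priority=d["priority"],
    )

# A POST of the kind FormRequest.from_response would produce:
post = FakeRequest("http://example.com/login", method="POST",
                   body=b"user=alice&pass=secret", priority=10)

# Round-trip through JSON, as a Redis-backed queue would:
restored = from_dict(json.loads(json.dumps(to_dict(post))))
print(restored.body)  # b'user=alice&pass=secret'
```

A serializer that omits the `body` key would return an empty body here, which is exactly the bug described above.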
OK, do you think this supports all styles of Scrapy requests? For example, I know there have been issues in the past with Splash
requests, which have been a long-sought-after feature, but prior discussion, such as on https://github.com/istresearch/scrapy-cluster/issues/108, moved towards continuing to customize the encoding/decoding rather than using the default Scrapy setup.
Like request_to_dict, request_from_dict handles the request's meta data and loads the right class when its type is not scrapy.http.Request, so these functions should work well with SplashRequest. I also ran some tests against scrapy-splash with a simple Scrapy project, and SplashRequest works fine with request_to_dict and request_from_dict, so I think it should also work fine in scrapy-cluster.
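The class-restoration step mentioned above can be sketched as follows. This is a simplified illustration of the idea (storing a dotted import path for non-default request subclasses and re-importing it on deserialization), not Scrapy's actual implementation; `collections.OrderedDict` stands in for a request subclass such as SplashRequest so the example runs without Scrapy:

```python
import importlib
from collections import OrderedDict

def class_to_path(cls):
    # When the request is not a plain scrapy.http.Request, its subclass
    # is recorded as a dotted path so it can be restored later.
    return cls.__module__ + "." + cls.__name__

def path_to_class(path):
    # Re-import the recorded class on deserialization.
    module, _, name = path.rpartition(".")
    return getattr(importlib.import_module(module), name)

# OrderedDict plays the role of a custom request subclass here:
path = class_to_path(OrderedDict)
print(path)  # collections.OrderedDict
assert path_to_class(path) is OrderedDict
```

This is why a SplashRequest pulled back out of the queue comes back as a SplashRequest rather than a plain Request, as long as the class is importable on the consuming side.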
Cool, I will have to compare it with the work I did in this commit https://github.com/istresearch/scrapy-cluster/commit/60e1b9d201df94a9b3f2e06a5b870b6918c58c75, but I bet the combination of these two items will get full Splash support within scrapy-cluster.
I can't find anything wrong with the updates in this PR. All tests pass, and this seems like an improved implementation of the cloned Scrapy function request_to_dict. Merged.
Use Scrapy's request_to_dict and request_from_dict to serialize requests so we can handle POST requests with parameters in the request body.