Closed hcoura closed 6 years ago
Scrapy 1.0 doesn't have without_none_values
which is making Travis fail. This method was introduced on 1.1 and used on DefaultHeadersMiddleware. So I assume for a important reason.
I wonder if we should actually support Scrapy 1.0, it's just a None
check anyway, but I don't see a reason for supporting a version that old.
Merging #58 into master will increase coverage by
0.39%
. The diff coverage is100%
.
@@ Coverage Diff @@
## master #58 +/- ##
==========================================
+ Coverage 94.16% 94.55% +0.39%
==========================================
Files 2 2
Lines 137 147 +10
==========================================
+ Hits 129 139 +10
Misses 8 8
Impacted Files | Coverage Δ | |
---|---|---|
scrapy_crawlera/middleware.py | 94.48% <100%> (+0.4%) |
:arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 63650a6...692b165. Read the comment docs.
IIUC, the headers defined in DEFAULT_REQUEST_HEADERS
will take precedence over the ones defined in CRAWLERA_DEFAULT_HEADERS
. But this is only true if Crawlera middleware is placed after Scrapy's DefaultHeadersMiddleware
, otherwise we'll have the opposite behavior.
As for the docs, I think we should mention that precedence will change depending on where you place your middleware. The docs already suggest adding it with the 610
priority, which is fine, but maybe better to be explicit regarding this when describing the headers behavior.
I like the idea in general and the code looks good. 👍
@stummjr @hcoura when did we decide to go from DEFAULT_CRAWLERA_HEADERS
to CRAWLERA_DEFAULT_HEADERS
, it looks inconsistent to me. What do you guys think?
It does look inconsistent, likely a oversight of mine back in September.
What would you rather do?
I just wanted confirmation that this isn't the way we should go, I thought it was discussed and agreed that should be the name of the settings. I'll add the necessary changes.
Looking at this again, I don't think it is inconsistent, every setting on this repo has been CRAWLERA_...
After discussion on https://github.com/scrapy-plugins/scrapy-crawlera/pull/54 it was clear that DEFAULT_CRAWLERA_HEADERS was the preferable way to handle crawlera specific headers.
After much thinking on https://github.com/scrapy-plugins/scrapy-crawlera/pull/52 I believe, for now, we shouldn't change default headers as it will not be backwards compatible.
This PR:
If you are ok with this, let me know and I will add documentation accordingly