calgo-lab / green-db

The monorepo that powers the GreenDB.
https://calgo-lab.github.io/green-db/

Improve startjob_test #153

Closed by en-GB 11 months ago

en-GB commented 1 year ago

Check whether the startjob URLs are actually valid, and whether they contain commas. Scrapy interprets a URL containing commas as a list of URLs, which is unwanted.
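A minimal sketch of what such a check could look like, using only the standard library. The function name `find_invalid_start_urls` and the example URLs are made up for illustration and are not taken from the green-db code base. It also assumes that "valid" means an `http` or `https` URL with a host, which may be stricter or looser than what the test should enforce.

```python
from urllib.parse import urlparse


def find_invalid_start_urls(urls):
    """Return (url, reason) pairs for every URL that fails a check.

    Hypothetical helper: "valid" is assumed to mean an http(s) URL
    with a host and without any comma.
    """
    problems = []
    for url in urls:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https"):
            problems.append((url, "missing or unsupported scheme"))
        elif not parsed.netloc:
            problems.append((url, "missing host"))
        elif "," in url:
            # A comma would make the URL look like a list of URLs.
            problems.append((url, "contains a comma"))
    return problems


# Example URLs are invented for this sketch.
example_urls = [
    "https://example.com/shoes",
    "https://example.com/shirts,jackets",
    "example.com/no-scheme",
]
for url, reason in find_invalid_start_urls(example_urls):
    print(f"{url}: {reason}")
```

A test could then simply assert that the returned list is empty for every configured start job.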

BigDatalex commented 1 year ago

Would it be possible to import the Scrapy method that parses the URLs and check whether the result is the same URL as the original one? That way we would not need to check manually for commas, and there may be other characters that cause errors which the test does not cover yet.
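The round-trip idea could be sketched roughly as follows. Since the actual Scrapy parsing function is not identified in this thread, the `parse` argument below is a placeholder: its default, a plain split on commas, is only an assumption standing in for whatever Scrapy really does. The name `survives_round_trip` is likewise made up for illustration.

```python
def survives_round_trip(url, parse=lambda value: value.split(",")):
    """Return True if parsing `url` yields exactly the original URL.

    `parse` is a stand-in (an assumption) for the real URL-parsing
    routine; swap in the actual function once it has been located.
    """
    return parse(url) == [url]


# Example URLs are invented for this sketch.
print(survives_round_trip("https://example.com/shoes"))
print(survives_round_trip("https://example.com/shirts,jackets"))
```

With the real parser plugged in, the test would cover every character the parser treats specially, not just commas.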

en-GB commented 1 year ago

That would be ideal. I just can't find the spot.