scrapy / scrapyd

A service daemon to run Scrapy spiders
https://scrapyd.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License

API - add items and logs url to listjobs.json #464

Closed mxdev88 closed 1 year ago

mxdev88 commented 1 year ago

After making a call to listjobs.json, I'd like to download the items or logs related to the finished jobs. Would it make sense to add something like below?

Two new attributes:

{
    "status": "ok",
    "pending": [],
    "running": [],
    "finished": [
        {
            "id": "2f16646cfcaf11e1b0090800272a6d06",
            "project": "myproject",
            "spider": "spider3",
            "start_time": "2012-09-12 10:14:03.594664",
            "end_time": "2012-09-12 10:24:03.594664",
            "items_url": "/items/myproject/spider3/2f16646cfcaf11e1b0090800272a6d06.jl",
            "log_url": "/logs/myproject/spider3/2f16646cfcaf11e1b0090800272a6d06.log"
        }
    ]
}

If so, I could submit a PR for it.
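A minimal sketch of what a client could do with the proposed fields — note that items_url and log_url are hypothetical here, since they do not exist in current scrapyd responses:

```python
def finished_outputs(base, listjobs_response):
    """Pair up absolute items/log URLs for each finished job, using the
    *proposed* items_url and log_url attributes."""
    return [(base + job["items_url"], base + job["log_url"])
            for job in listjobs_response["finished"]]

# Example using the response shape proposed above:
response = {
    "status": "ok",
    "finished": [{
        "id": "2f16646cfcaf11e1b0090800272a6d06",
        "project": "myproject",
        "spider": "spider3",
        "items_url": "/items/myproject/spider3/2f16646cfcaf11e1b0090800272a6d06.jl",
        "log_url": "/logs/myproject/spider3/2f16646cfcaf11e1b0090800272a6d06.log",
    }],
}
urls = finished_outputs("http://localhost:6800", response)
```

The client never needs to know how the server lays out its items and logs directories; it only joins the server-provided paths to its base URL.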

jpmckinney commented 1 year ago

That seems reasonable. That said, couldn't these URLs be constructed based on the available data?
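For reference, a sketch of the client-side construction being described, assuming the /items/&lt;project&gt;/&lt;spider&gt;/&lt;id&gt;.jl and /logs/&lt;project&gt;/&lt;spider&gt;/&lt;id&gt;.log layout shown in the example above (scrapyd's storage layout is configurable, so this is not guaranteed to hold on every deployment):

```python
def build_urls(base, job):
    # Hard-codes the path layout from the example above; if the server's
    # items/logs layout changes, every client doing this breaks.
    items_url = f"{base}/items/{job['project']}/{job['spider']}/{job['id']}.jl"
    log_url = f"{base}/logs/{job['project']}/{job['spider']}/{job['id']}.log"
    return items_url, log_url

job = {"id": "2f16646cfcaf11e1b0090800272a6d06",
       "project": "myproject", "spider": "spider3"}
items_url, log_url = build_urls("http://localhost:6800", job)
```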

mxdev88 commented 1 year ago

That's true. Currently the client can build the URL from the project, spider, and job id, but my reasoning here is that the client should be dumb and rely on the server to say where the results are located. If the server-side logic ever changes, the change would be transparent to clients. Does that make sense?

jpmckinney commented 1 year ago

Yup, fine with me.