scrapinghub / python-hubstorage

Deprecated HubStorage client library - please use python-scrapinghub>=1.9.0 instead
https://github.com/scrapinghub/python-scrapinghub

jobs.list() is broken #38

Open eliasdorneles opened 8 years ago

eliasdorneles commented 8 years ago

I'm getting a 400 Bad Request error when calling project.jobs.list():

>>> import hubstorage
>>> hubstorage.__version__
'0.18.1'
>>> from hubstorage import HubstorageClient
>>> import os
>>> apikey = os.getenv('SHUB_APIKEY')
>>> client = HubstorageClient(apikey)
>>> project = client.get_project(7389)
>>> project.jobs.list()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/elias/.local/lib/python2.7/site-packages/hubstorage/project.py", line 69, in list
    return self.apiget(_key, params=params)
  File "/home/elias/.local/lib/python2.7/site-packages/hubstorage/resourcetype.py", line 40, in apiget
    return self.apirequest(_path, method='GET', **kwargs)
  File "/home/elias/.local/lib/python2.7/site-packages/hubstorage/resourcetype.py", line 34, in apirequest
    return jldecode(self._iter_lines(_path, **kwargs))
  File "/home/elias/.local/lib/python2.7/site-packages/hubstorage/resourcetype.py", line 30, in _iter_lines
    r.raise_for_status()
  File "/home/elias/.local/lib/python2.7/site-packages/requests/models.py", line 837, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: BAD REQUEST for url: http://storage.scrapinghub.com/jobs/7389

Is this expected?

bertinatto commented 8 years ago

@eliasdorneles, my 2 cents:

It seems that HubStorage's /jobs/ endpoint no longer supports scanning [1], so you probably want to use project.jobq.list() instead. It'd be even better if you could use project.get_jobs(), but that depends on what you need.
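A minimal caller-side sketch of that workaround, assuming project.jobq.list() returns an iterable of job dicts. The helper first_jobs and the stand-in classes are hypothetical names, used only so the snippet runs without network access; with a real client you would pass the object returned by client.get_project(7389). The 'key' field in the fake data is also an assumption.

```python
from itertools import islice


def first_jobs(project, limit=10):
    """Return up to `limit` jobs from project.jobq.list().

    Hypothetical helper, not part of hubstorage.
    """
    return list(islice(project.jobq.list(), limit))


# Stand-ins so the sketch runs offline; swap in a real project object.
class _FakeJobQ(object):
    def list(self):
        # Yields dicts shaped like job summaries (the 'key' field is assumed).
        for n in range(1, 4):
            yield {'key': '7389/1/%d' % n}


class _FakeProject(object):
    jobq = _FakeJobQ()


jobs = first_jobs(_FakeProject(), limit=2)
```

Using islice keeps the limit on the client side, so the sketch does not rely on any particular query parameter of the jobq endpoint.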

IMO we could either ditch Jobs.list() (or even the whole class, since it has only one method), or have it call JobQ.list() to keep compatibility and make things easier for applications currently using python-hubstorage. What do you think, @dangra?

[1] https://github.com/scrapinghub/python-hubstorage/commit/758dcd8edb990b4b4e907fc38578eaff2da8a237#diff-72f6362474c91e49b25ba9c4fb5dab86L45

dangra commented 8 years ago

@eliasdorneles, @bertinatto is correct. Listing on /jobs/ no longer works, and /jobq/NN/list should be used instead. We can provide a semi-backward-compatible solution by routing /jobs/NN/list to /jobq/NN/list. The main differences are that not all job metadata is returned (you have to ask for the specific meta fields) and that the order of the results has changed: the most recent jobs are returned first.
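For illustration, the client-side equivalent of that routing could look roughly like the sketch below. JobsCompat and _RecordingJobQ are hypothetical names, not the library's actual classes; the shim only forwards the call, so callers get jobq semantics (newest first, only the requested metadata).

```python
class JobsCompat(object):
    """Hypothetical stand-in for Jobs that forwards list() to a JobQ-like object."""

    def __init__(self, jobq):
        self._jobq = jobq

    def list(self, **params):
        # Forward unchanged; results follow jobq ordering and metadata rules.
        return self._jobq.list(**params)


class _RecordingJobQ(object):
    """Offline fake that records calls and returns newest-first sample data."""

    def __init__(self):
        self.calls = []

    def list(self, **params):
        self.calls.append(params)
        return iter([{'key': '7389/1/2'}, {'key': '7389/1/1'}])


jobq = _RecordingJobQ()
compat_jobs = JobsCompat(jobq)
result = list(compat_jobs.list())
```

Code that depended on the old ordering or on full job metadata would still need changes, which is why this is only semi-compatible.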

According to the service logs, the projects using that API were already migrated. Did I miss one?

eliasdorneles commented 8 years ago

Well, I had to fix a script that we run periodically.

@bertinatto +1 to updating the hubstorage client library, but instead of removing the method I'd raise an exception with a nice message telling the user the new way of doing things.
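Something along these lines is what I have in mind. This is only a sketch: the exception name JobsListRemoved and the message wording are made up here, and the actual PR may do it differently.

```python
class JobsListRemoved(Exception):
    """Hypothetical exception type for the removed listing call."""


class Jobs(object):
    """Simplified stand-in for the client's Jobs class."""

    def list(self, *args, **kwargs):
        # Fail early with guidance instead of letting the server return a 400.
        raise JobsListRemoved(
            "project.jobs.list() is no longer supported by the /jobs/ endpoint; "
            "use project.jobq.list() or project.get_jobs() instead."
        )


try:
    Jobs().list()
except JobsListRemoved as exc:
    message = str(exc)
```

That way a periodic script fails with a pointer to the replacement rather than a bare HTTPError.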

bertinatto commented 8 years ago

@eliasdorneles :+1:

Updated PR.