jrxFive / python-nomad

Client library Hashicorp Nomad
https://python-nomad.readthedocs.io/en/latest/
MIT License
139 stars 73 forks source link

dispatch_job with payload doesn't work #58

Closed alexiri closed 3 years ago

alexiri commented 6 years ago

I noticed that I wasn't able to dispatch a job with a payload, and there was no test covering this scenario. I've fixed it and added a test.

codecov-io commented 6 years ago

Codecov Report

Merging #58 into master will increase coverage by 0.08%. The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #58      +/-   ##
==========================================
+ Coverage   87.14%   87.23%   +0.08%     
==========================================
  Files          26       26              
  Lines        1167     1175       +8     
==========================================
+ Hits         1017     1025       +8     
  Misses        150      150
Impacted Files Coverage Δ
nomad/api/job.py 95.18% <100%> (+0.51%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 1e929b9...d05d5cb. Read the comment docs.

jrxFive commented 6 years ago

Hey @alexiri, thanks for the PR! I don't think its safe to assume that the payload will be JSON, we could have all types of different types ini, toml, yaml, anything really.

I think it would be best to assume that the payload has already been base64 encoded before calling the dispatch method. I would like to keep this test though if thats okay and if you don't mind reversing the changes for job.py? It was really great on your part to add that, lemme know what you think. Thanks!

alexiri commented 6 years ago

Ok, I agree that assuming the payload is JSON is wrong, but in the absence of documentation clarifying that parameter it isn't completely unreasonable.

However, I still think this function should do the encoding itself. The fact that the payload has to be base64 encoded is an internal detail of the API that the user doesn't need to know about. They should just be able to pass whatever they want to this function and it should be sent to Nomad transparently.

How about this change instead?

jrxFive commented 6 years ago

Only issue with that change is that it could be a potentially breaking change if user's of the library are already sending as an base64 string to the function, we would re-run an encoding of that on that given parameter so whenever the attempt to decode it would still be in base64 and would require another decode. I want to accept this but I also don't want things to break for people.

https://stackoverflow.com/questions/8571501/how-to-check-whether-the-string-is-base64-encoded-or-not

As much as I would hate to add regex to this problem if we could ensure that the string is already base64 encoded. Or ask for forgiveness and attempt a decode if it works use the payload, if not serialize and encode. I'd feel alot better about merging this and updating documentation to the dispatch job in the nomad docs and method docstring.

Sorry for all the back and forth along with the delay, I really appreciate the PR and suggestions.

alexiri commented 6 years ago

I thought about that also, but there's really no way to be sure if something is base64-encoded or not. The string "hola" would pass that regexp, but isn't actually base64 encoded.

I think this is one of those times you have to bite the bullet and break backwards compatibility in order to fix a bug. As breakages go it's not too bad, users just have to remove one function call that is now redundant. Anyway, I'm guessing not that many people use this function, otherwise they would have complained earlier.

erhlee-bird commented 6 years ago

Ok, I agree that assuming the payload is JSON is wrong, but in the absence of documentation clarifying that parameter it isn't completely unreasonable.

Adding the test case and documentation for this particular detail of the Nomad dispatch API seems to be the much safer option in handling this change.


I would strongly favor not making this change as it would break our team's workflow in two specific ways.

  1. We specifically use a different serialization format than json before base64-encoding to handle binary data.
# Simplified for brevity
>>> client = nomad.Nomad()
>>> client.job.dispatch_job("my-dispatch-job", payload={"binpack": [b"bytes", b"more bytes"]})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/site-packages/nomad/api/job.py", line 240, in dispatch_job
    return self._post(id, "dispatch", json_dict=dispatch_json)
  File "/usr/local/lib/python3.7/site-packages/nomad/api/job.py", line 157, in _post
    response = self._requester.post(url, json=kwargs.get("json_dict", None), params=kwargs.get("params", None))
  File "/usr/local/lib/python3.7/site-packages/nomad/api/base.py", line 117, in post
    timeout=self.timeout)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 559, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 498, in request
    prep = self.prepare_request(req)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 441, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "/usr/local/lib/python3.7/site-packages/requests/models.py", line 312, in prepare
    self.prepare_body(data, files, json)
  File "/usr/local/lib/python3.7/site-packages/requests/models.py", line 462, in prepare_body
    body = complexjson.dumps(json)
  File "/usr/local/lib/python3.7/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/usr/local/lib/python3.7/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/local/lib/python3.7/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/usr/local/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type bytes is not JSON serializable
  1. We have payloads with strings that are not utf-8 encoded. Wrapping such strings (utf-16-le for example) in our preferred serialization format allows us to preserve our payload transparently through to the Nomad API and to not be surprised by unexpected pre-processing of our payload.
alexiri commented 6 years ago

Fine, how about only doing the base64 encoding automatically? The issue is for python 2 that requires a string and for python 3 a byte-like object, but I guess it could be explained in the documentation.

erhlee-bird commented 6 years ago

Automatically performing the base64 encoding is still making assumptions on the input encoding as well as potentially redundantly handling inputs that are already base64-encoded.

>>> base64.b64encode("Hello World")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.7/base64.py", line 58, in b64encode
    encoded = binascii.b2a_base64(s, newline=False)
TypeError: a bytes-like object is required, not 'str'

The Nomad API will tell you if your input is malformed. I think effort in making the error if it occurs like with #59 clearer and improving the documentation are much better solutions in this scenario.