driplineorg / dripline-python

python implementation of project8/dripline
Apache License 2.0

entities don't handle being set to np.inf #133

Open raphaelcervantes opened 3 years ago

raphaelcervantes commented 3 years ago

My data-taking script hit an edge case where it tried to set an entity to np.inf. The script just hung, and I didn't even see an error in the service logs. I think dripline should catch this case and either crash intentionally or raise an error.

To go into more detail, here is the relevant portion of my control script.

    popt_reflection, pcov_reflection = data_lorentzian_fit(s11_pow, freq, 'reflection')
    perr_reflection = np.sqrt(np.diag(pcov_reflection))

    print('Reflection lorentzian fitted parameters')
    print(popt_reflection)
    self.cmd_interface.set('f_reflection', popt_reflection[0])
    self.cmd_interface.set('sig_f_reflection', perr_reflection[0])
    self.cmd_interface.set('Q_reflection', popt_reflection[1])
    self.cmd_interface.set('sig_Q_reflection', perr_reflection[1])
    self.cmd_interface.set('dy_reflection', popt_reflection[2])
    self.cmd_interface.set('sig_dy_reflection', perr_reflection[2])
    self.cmd_interface.set('C_reflection', popt_reflection[3])
    self.cmd_interface.set('sig_C_reflection', perr_reflection[3])

My script couldn't perform the fit on the VNA trace.

 Setting na_measurement_status to start_measurement
Logging list of endpoints
Switching to transmission path
Switching to reflection path
Switching to transmission path
VNA reflection measurement
Setting na_measurement_status to start_measurement
Logging list of endpoints
Switching to transmission path
Transmission lorentzian fitted parameters
[1.61968618e+10 1.59075036e+04 2.62957230e-01 3.45835612e-03]
Switching to reflection path
/usr/local/lib/python3.7/site-packages/scipy/optimize/minpack.py:829: OptimizeWarning: Covariance of the parameters could not be estimated
  category=OptimizeWarning)
Reflection lorentzian fitted parameters
[1.61903696e+10 4.84379319e+00 2.84176343e+02 2.84231409e+02]

I think that when the fit emits the OptimizeWarning, the values in my pcov are np.inf, so the script tried to set sig_f_reflection to np.inf and just hung without making any progress.

➜  ~ kubectl logs -f double-precision-logger-dripline-python-deployment-6b7fbf8d4b7h --tail 20 
{'timestamp': '2021-03-22T16:32:18.386845Z', 'sensor_name': 'Q_transmission', 'value_cal': 15907.503649437182, 'value_raw': 15907.503649437182}
2021-03-22T16:32:18[INFO    ] dripline.implementations.postgres_sensor_logger(49) -> finished processing data
2021-03-22T16:32:18[INFO    ] dripline.implementations.postgres_sensor_logger(46) -> insert data are:
{'timestamp': '2021-03-22T16:32:18.399052Z', 'sensor_name': 'sig_Q_transmission', 'value_cal': 570.3355496234211, 'value_raw': 570.3355496234211}
2021-03-22T16:32:18[INFO    ] dripline.implementations.postgres_sensor_logger(49) -> finished processing data
2021-03-22T16:32:18[INFO    ] dripline.implementations.postgres_sensor_logger(46) -> insert data are:
{'timestamp': '2021-03-22T16:32:18.411096Z', 'sensor_name': 'dy_transmission', 'value_cal': 0.26295723019139855, 'value_raw': 0.26295723019139855}
2021-03-22T16:32:18[INFO    ] dripline.implementations.postgres_sensor_logger(49) -> finished processing data
2021-03-22T16:32:18[INFO    ] dripline.implementations.postgres_sensor_logger(46) -> insert data are:
{'timestamp': '2021-03-22T16:32:18.423272Z', 'sensor_name': 'sig_dy_transmission', 'value_cal': 0.010107652156098478, 'value_raw': 0.010107652156098478}
2021-03-22T16:32:18[INFO    ] dripline.implementations.postgres_sensor_logger(49) -> finished processing data
2021-03-22T16:32:18[INFO    ] dripline.implementations.postgres_sensor_logger(46) -> insert data are:
{'timestamp': '2021-03-22T16:32:18.436062Z', 'sensor_name': 'C_transmission', 'value_cal': 0.003458356118361542, 'value_raw': 0.003458356118361542}
2021-03-22T16:32:18[INFO    ] dripline.implementations.postgres_sensor_logger(49) -> finished processing data
2021-03-22T16:32:18[INFO    ] dripline.implementations.postgres_sensor_logger(46) -> insert data are:
{'timestamp': '2021-03-22T16:32:18.449049Z', 'sensor_name': 'sig_C_transmission', 'value_cal': 0.00015618094767875254, 'value_raw': 0.00015618094767875254}
2021-03-22T16:32:18[INFO    ] dripline.implementations.postgres_sensor_logger(49) -> finished processing data
2021-03-22T16:32:26[INFO    ] dripline.implementations.postgres_sensor_logger(46) -> insert data are:
{'timestamp': '2021-03-22T16:32:26.161902Z', 'sensor_name': 'f_reflection', 'value_cal': 16190369623.81609, 'value_raw': 16190369623.81609}
2021-03-22T16:32:26[INFO    ] dripline.implementations.postgres_sensor_logger(49) -> finished processing data
laroque commented 3 years ago

It would be useful to understand where this is failing. Dripline should be trying to serialize this as json, which probably won't do what you want, but I think it should work from the client side. I'd bet the service doesn't know what to do with whatever that gets rendered into, though.
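As a rough illustration of why JSON and infinity do not mix well, here is what Python's standard json module does with a non-finite float. This is only an analogy: dripline serializes through its C++ layer, so its actual behavior may differ, and the payload dictionary here is made up for the example.

```python
import json

# A made-up payload containing a non-finite value.
payload = {"values": [1.0, float("inf")]}

# By default Python emits the bare token Infinity, which is not valid JSON,
# so a strict parser on the receiving side may reject it.
text = json.dumps(payload)
print(text)

# With allow_nan=False the encoder refuses instead of emitting the token.
strict_error = None
try:
    json.dumps(payload, allow_nan=False)
except ValueError as exc:
    strict_error = exc
print("strict encoding failed:", strict_error)
```

The strict mode shows the kind of early, explicit failure that is easier to debug than a hang.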

raphaelcervantes commented 3 years ago

@laroque Is there anything you want me to do on my end to diagnose this?

For now, I have my control scripts throw an exception whenever my curve_fit throws an error.
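A minimal sketch of a guard along those lines, assuming it is enough to check each value just before it is sent. The helper name require_finite and the FakeInterface class are hypothetical stand-ins for illustration; only the set(name, value) call shape is taken from the script above.

```python
import math


def require_finite(name, value):
    """Return value as a built-in float, or raise if it is inf or NaN."""
    native = float(value)  # float() also accepts numpy scalars
    if not math.isfinite(native):
        raise ValueError(f"refusing to set {name!r} to non-finite value {native!r}")
    return native


class FakeInterface:
    """Hypothetical stand-in for cmd_interface; it only records calls."""

    def __init__(self):
        self.sent = []

    def set(self, name, value):
        self.sent.append((name, value))


interface = FakeInterface()
interface.set("f_reflection", require_finite("f_reflection", 1.61903696e10))
try:
    # An infinite uncertainty, as produced when the covariance is not estimated.
    interface.set("sig_f_reflection", require_finite("sig_f_reflection", float("inf")))
except ValueError as exc:
    print(exc)
```

Because the check runs before set is called, the non-finite value never reaches the message bus and the script fails with a clear message.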

laroque commented 3 years ago

I've been mulling this over but I haven't dug into the code (much less actually tried to reproduce it). It isn't clear to me whether this is a problem on the client side with sending the message or on the server side with responding to it, and the solution will depend heavily on which. There are several paths we could follow up:

Question

Can you help clarify where it is failing? I think the steps are something like:

  1. your python code makes a call to core dripline to send a message
  2. there are calls down to the C++ which converts to a message object
  3. the actual message is sent over AMQP
  4. the message is received by the service and decoded back into some native objects
  5. your custom Entity does something with the message object it received

this leads to....

Hypothesis

I expect that the problem is in step 2 above: the C++ binding doesn't know how to deal with a numpy array and/or a numpy.inf value when converting to a scarab Param object (for constructing a dripline message and eventually serializing to json). A useful check of this idea would be to use something like the monitor subcommand, or maybe just to watch the entity logs, to see whether those calls to set are actually producing a dripline message on the AMQP bus and whether those messages look sensible. If this fails, then the issue is with the (numpy.*) -> (dripline Message) conversion.

Another possibility (if the above is all working) is that the Entity is receiving the message but the types don't make sense and something in the code is failing. For example, you start with numpy.array([1, 2, 3, numpy.inf]) (a numpy array with numeric values that may include infinity) in the client, but what you end up with on the server is [1, 2, 3, "Infinity"] (a list with a mix of strings and numbers).

Sidestep/workaround

We should probably figure out a place to document this more clearly, but dripline as written doesn't support arbitrary data types from outside of the standard python library (for example, numpy types). There are a couple of issues around this:

... that was a distraction, what I meant to say was: if you convert your data to native python types before calling set, then dripline should be able to deal with that data the same way that it always does. Then, if your Entity needs to consume data as numpy types, possibly with special values like infinity, it is your responsibility as the implementer of the entity to convert the native python types you get in the message payload to the custom data type you need in your implementation.
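A sketch of what that entity-side conversion might look like, assuming the payload really does arrive as a list mixing numbers and strings such as "Infinity". That rendering is only the hypothesis from earlier in this thread, not confirmed dripline behavior, and the helper name restore_floats is made up for illustration.

```python
import math

# Assumed string spellings of the special values; adjust to what actually arrives.
_SPECIAL = {"Infinity": math.inf, "-Infinity": -math.inf, "NaN": math.nan}


def restore_floats(values):
    """Convert a decoded payload list into a list of built-in floats."""
    restored = []
    for item in values:
        if isinstance(item, str):
            if item not in _SPECIAL:
                raise ValueError(f"unexpected string in numeric payload: {item!r}")
            restored.append(_SPECIAL[item])
        else:
            restored.append(float(item))
    return restored


print(restore_floats([1, 2, 3, "Infinity"]))
```

The resulting list can then be passed to numpy.asarray inside the entity if an array is needed.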

nsoblath commented 3 years ago

We could also look at how something like JSON or YAML encoding works for objects like that in Python. I presume there must be some fairly transparent way to translate in either direction between a numpy array and the fairly limited set of data types allowed by JSON and YAML.
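For reference, one common pattern in plain Python is the default hook of the standard json module, sketched below. It relies on the fact that numpy arrays and numpy scalars both provide a tolist() method that returns built-in types. The names numpy_default and FakeArray are hypothetical; FakeArray stands in for numpy.ndarray only so the sketch runs without numpy installed. Nothing here reflects how dripline itself encodes messages.

```python
import json


def numpy_default(obj):
    """json 'default' hook: convert array-like objects through tolist()."""
    tolist = getattr(obj, "tolist", None)
    if callable(tolist):
        return tolist()
    raise TypeError(f"{type(obj).__name__} is not JSON serializable")


class FakeArray:
    """Hypothetical stand-in for numpy.ndarray."""

    def __init__(self, data):
        self._data = list(data)

    def tolist(self):
        return list(self._data)


# The hook is only called for objects json cannot encode on its own.
encoded = json.dumps({"trace": FakeArray([1.0, 2.5])}, default=numpy_default)
print(encoded)

# The receiver gets a plain list and can rebuild an array from it if needed.
decoded = json.loads(encoded)["trace"]
print(decoded)
```

In the other direction, the receiver would typically pass the decoded list to numpy.asarray. Special values such as infinity still need separate handling, since the hook is not consulted for ordinary floats.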

raphaelcervantes commented 3 years ago

I'm trying to push hard on Orpheus commissioning right now. I think I'll do some DAQ development next week and can look at this then.

I'd be ok with being required to convert to a native python type, as long as that requirement is explicit. But I think it would be good if dripline threw an error when I didn't do that, rather than just hanging and forcing me to manually kill the docker container.

laroque commented 3 years ago

Your expectation here (getting an exception and an explicit requirement in the docs) is reasonable and is what I would have expected to happen. I would have expected a failed type conversion to produce an error, and I would have expected a failed attempt to send a message to time out and produce a (possibly not so helpful) error.

I think that this is something that we (the dripline side) should fix to basically the state you asked for; the problem is just finding someone with the time to dig in, isolate the problem, and fix it. The suggestion to convert to python types may or may not end up being the only option long term, but it is probably the fastest solution to your problem and lets you get back to focusing on Orpheus, since I don't know that we'll be able to resolve it by then.
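Until that is fixed, a control script could turn a hang into an error with a client-side watchdog. The sketch below is a generic Python pattern, not part of dripline; the name call_with_timeout and its timeout_s parameter are hypothetical.

```python
import threading


def call_with_timeout(func, *args, timeout_s=5.0):
    """Run func(*args) in a daemon thread; raise TimeoutError if it does not finish."""
    outcome = {}

    def worker():
        try:
            outcome["value"] = func(*args)
        except Exception as exc:  # hand any failure back to the caller
            outcome["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)
    if thread.is_alive():
        # The stuck call is abandoned, not cancelled; the daemon flag only
        # keeps it from blocking interpreter exit.
        raise TimeoutError(f"call did not return within {timeout_s} s")
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

In the control script above this might be used as call_with_timeout(self.cmd_interface.set, 'sig_f_reflection', value, timeout_s=10.0). The underlying call keeps running in the background, so this only makes the failure visible; it does not fix the cause.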

raphaelcervantes commented 3 years ago

Here is another instance of dripline not being able to handle numpy arrays.

I tried to use the scipy.interpolate.interp1d function in a dripline extension like so

    interpolated_function = interp1d(resistance_cal, temperature_cal, kind = 'cubic')
    interpolated_temperature = interpolated_function(resistance)

This returns a single-valued numpy array, assuming resistance is single-valued.

I see this error in my k8s logs when the calibration function gets called.

2021-06-15T15:30:23[DEBUG   ] dripline.core.calibrate(43) -> formatted cal is:
x83871_cal(+2.02340640E+03)
2021-06-15 15:29:25 [ERROR] rary/endpoint.cc(191): Caught exception from Python: RuntimeError: Unknown python type cannot be converted to param

At:
  /usr/local/lib/python3.8/site-packages/dripline/core/endpoint.py(41): do_get_request

To work around this, I cast the interpolated result to a float.

https://github.com/axiondarkmatterexperiment/dripline-orpheus/blob/0a337da26df6f580e328599b6b273d985e44126b/dripline/extensions/agilent34970A/muxer_calibrations.py#L24