Why is the setpoint only configured to two decimal points of precision?

dom-insytesys commented 4 years ago

We're encountering an issue with our test system where we divide a fixed total flow between various MFC's that have different max flow capabilities. The total flow often shows small, but noticeable errors that are degrading our experimental data.

I finally tracked the problem down to the fact that _set_setpoint() in serial.py only sets the setpoint to two decimal places of precision. Our flow rates are getting rounded up by this function. e.g. 0.376 SLPM is being set as 0.38 SLPM. This flow controller is a 1 SLPM max flow device, so the code is artificially limiting its precision to 1 part in 100, whereas the actual nominal capability is 1 part in 64,000. The driver is effectively degrading the precision of the device.
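
The rounding can be reproduced without a device. This is a minimal sketch, assuming the command string is built with a fixed two-decimal format specifier; the exact string handling in serial.py may differ.

```python
# Format a setpoint the way a fixed two-decimal format string would.
setpoint = 0.376  # SLPM, on a 1 SLPM full-scale device

two_decimals = "{:.2f}".format(setpoint)
three_decimals = "{:.3f}".format(setpoint)

print(two_decimals)    # 0.38
print(three_decimals)  # 0.376
```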

To my mind, this is clearly a bug.

In the documentation, I see that the serial protocol of these MFC's allows for the setpoint to be configured as an integer between one and 64000. So I can work around the problem by writing my own version of set_flow_rate(), but it's really frustrating that the standard API doesn't provide a way to access this capability.

patrickfuller commented 4 years ago

I've run into this as well, and I'll explain why the library currently works the way it does.

Explanation

Alicat has a main command (pg42 of the manual) that forces the decimal truncation, e.g. as15.44. This command is stable across Alicat versions. It works with older and newer ones, even if lower-level commands change.

Below this command is a low-level API that allows you to set specific memory addresses. This usually follows the pattern aw122=60000, which means "Unit A, Write to address 122 the value 60000". Each address, or "register", is 16bit so supports 0-65535. In newer Alicats, they included the shorthand a60000 to write to the setpoint register.

Here's the frustrating part - the register shorthand doesn't work on all Alicats. Furthermore, the registers change between Alicat versions, e.g. address 122 may be 65 on an older Alicat.
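
As a rough illustration of the two command shapes described above, the sketch below builds the strings only. The helper names are hypothetical and not part of the library, the line terminator is omitted, and whether a given device accepts either form depends on its firmware.

```python
def register_write_command(unit, register, value):
    """Build a register write such as 'aw122=60000' (hypothetical helper)."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("Register values are 16-bit: 0-65535.")
    return "{}w{}={}".format(unit, register, value)


def shorthand_setpoint_command(unit, counts):
    """Build the shorthand form such as 'a60000' (hypothetical helper)."""
    if not 0 <= counts <= 0xFFFF:
        raise ValueError("Setpoint counts must fit in 16 bits.")
    return "{}{}".format(unit, counts)


print(register_write_command("a", 122, 60000))  # aw122=60000
print(shorthand_setpoint_command("a", 60000))   # a60000
```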

Proper Fix

If I had time and Alicat support, I'd probably compile a big list of Alicat versions + register maps. I'd query the device on connection to figure out what version it's running, and then be able to automatically expose an appropriate python API while hiding the memory addresses.

Short-term Fix

What if you change the units from SLPM to sccm? It should be doable on the device front panel.

dom-insytesys commented 4 years ago

Thanks for the speedy and informative reply!

I wasn't aware that changing the units on the front panel affected the units used to program the device over serial comms. I'll give that a try. The comment in set_flow_rate() reads "flow: The target flow rate, in units specified at time of purchase", which makes it sound like this is fixed and not user-configurable.

patrickfuller commented 4 years ago

Agreed, but I know we have some 1SLPM controllers that are being controlled by sccm. If it doesn’t work, then I’ll show you how to monkey patch the library to use the register.

dom-insytesys commented 4 years ago

Just read some of the manual that you linked to. On page 20, it says that you can set "Button engineering units" or "Device engineering units". If I understand it correctly, the former affects the front panel only, the latter changes both front panel and serial communications.

So that should be an acceptable workaround. But in order to write reliable code (i.e. that operates consistently whether or not the front panel has been adjusted correctly), I really need to be able to query and configure the units via the serial comms. There's nothing in the "Serial Command Guide" (page 43) that describes how to do this. Although it does say helpfully "If you have need of more advanced serial communication commands, please contact Alicat." :-)

RickPattonAlicat commented 4 years ago

Hi dom-insytesys and patrickfuller, Alicat Test Engineer here! Still new to Git, so please excuse any newbie mistakes, but I do have some comments on the issues discussed above.

Regarding setpoint precision, the _set_setpoint() function here assumes all devices have 2-digit floating point precision, which may not be the case depending on device range or selected flow units. IMO, a reliable way of determining the precision of the setpoint is to simply get() the device state, and parse out the number of decimals, using that value to format your setpoint command string.

Regarding register map changes, all registers should be backwards compatible across the history of our firmware, though many registers have been added over the years, so higher registers (with similar functionality to lower registers) may not exist on older devices. Major changes occurred with a PCB hardware change starting with S/N# 80000 and beyond, where many of the perhaps more familiar ASCII commands were added. Prior to S/N# 80000, the highest register available was Register 79. Another major change occurred starting with firmware version 6vXX and beyond, and again a smaller addition with 7vXX.

TL;DR... self.control_point can be determined by querying Register 20 instead. On more recent firmware builds with Register 122, Register 122 and Register 20 are linked asynchronously, so changing one will change the other. Per this, I recommend redefining:

    # Mass Flow = +1024, Vol Flow = +768, Pressure = +256
    registers = {'flow': 0b0000010000000000,
                 'volume': 0b0000001100000000,
                 'pressure': 0b0000000100000000}

See changes to other functions and added _get_control_precision() below:

    def _set_setpoint(self, setpoint, retries=2):
        """Set the target setpoint.
        Called by 'set_flow_rate' and 'set_pressure', which both use the same
        command once the appropriate register is set.
        """
        self._test_controller_open()

        command = '{addr}S{setpoint:.{decimals}f}\r'.format(
            addr=self.address,
            setpoint=setpoint,
            decimals=self._get_control_precision())
        line = self._write_and_read(command, retries)

        # Some Alicat models don't return the setpoint. This accounts for
        # these devices.
        try:
            current = float(line.split()[-2])
        except IndexError:
            current = None

        if current is not None and abs(current - setpoint) > 0.01:
            raise IOError("Could not set setpoint.")

    def _get_control_point(self, retries=2):
        """Get the current control point by reading Register 20."""
        command = '{addr}R20\r'.format(addr=self.address)
        line = self._write_and_read(command, retries)
        if not line:
            return None
        value = int(line.split('=')[-1])
        try:
            return next(p for p, r in self.registers.items() if value & r == r)
        except StopIteration:
            raise ValueError("Unexpected register value: {:d}".format(value))

    def _get_control_precision(self, retries=2):
        """Get the number of decimal places the device reports for the
        setpoint.
        """
        dataframe = self.get(retries=retries)
        if not dataframe:
            return None

        try:
            decimals = len(str(dataframe["setpoint"]).split('.')[1])
        except IndexError:
            decimals = 0
        return decimals

    def _set_control_point(self, point, retries=2):
        """Set whether to control on mass flow or pressure.
        Args:
            point: Either "flow" or "volume" or "pressure".
        """
        if point not in self.registers:
            raise ValueError("Control point must be 'flow' or 'volume' or 'pressure'.")

        # Get current device control state.
        command = '{addr}R20\r'.format(addr=self.address)
        line = self._write_and_read(command, retries)
        if not line:
            raise IOError("Could not detect current device control state.")
        curr_reg = int(line.split('=')[-1])

        # Subtract current control bitvalue; add new control bitvalue.
        reg = curr_reg - self.registers[self.control_point] + self.registers[point]
        command = '{addr}W20={reg:d}\r'.format(addr=self.address, reg=reg)
        line = self._write_and_read(command, retries)

        value = int(line.split('=')[-1])
        if value & reg != reg:
            raise IOError("Could not set control point.")
        self.control_point = point
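
One caveat when counting decimals: if the value has already been converted to a float, trailing zeros are gone by the time it is turned back into a string. The sketch below counts digits on the raw response text instead. It assumes the raw field is available as a string; the helper name and the sample field are made up for illustration.

```python
def decimals_in_field(text):
    """Count digits after the decimal point in a raw response field."""
    return len(text.partition(".")[2])


raw_field = "+000.380"  # made-up field, as it might appear on the wire

print(decimals_in_field(raw_field))              # 3
print(decimals_in_field(str(float(raw_field))))  # 2, the trailing zero is lost
```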

Regarding flow units selected, this can be queried with an "FPF" command (5 = Mass Flow, 4 = Vol Flow, 3 = Temperature, 2 = Absolute Pressure). E.g.:

A FPF 5 returns "A [fullscale] [unitNumber] [unitLabel]", or in a specific example: "A 10.000 7 SLPM"
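
A minimal sketch of parsing that reply, assuming it always has the four whitespace-separated fields shown in the example. The function and field names are hypothetical, and other firmware versions may format the reply differently.

```python
from collections import namedtuple

FlowUnits = namedtuple(
    "FlowUnits", "unit_id fullscale decimals unit_number unit_label")


def parse_fpf_response(line):
    """Parse a reply shaped like 'A 10.000 7 SLPM' (hypothetical helper)."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError("Unexpected FPF response: {!r}".format(line))
    unit_id, fullscale_text, unit_number, unit_label = fields
    # Count decimals on the raw text; float() would drop trailing zeros.
    decimals = len(fullscale_text.partition(".")[2])
    return FlowUnits(unit_id, float(fullscale_text), decimals,
                     int(unit_number), unit_label)


units = parse_fpf_response("A 10.000 7 SLPM")
print(units.fullscale, units.decimals, units.unit_label)  # 10.0 3 SLPM
```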

In fact, a more general solution to the setpoint precision and units question could be to create a UnitConverter class to convert between potential mismatches of input flow values to device flow values, and vice versa, depending on the current device configuration and whichever units are important to you (e.g. SLPM = SCCM/1000.0). Depending on whether or not you want to allow a user to change the device configuration after loading into your script, you can load the device configuration during __init__ or query device configuration before sending certain commands (like setpoint, so you know the current device flow units).
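
A rough sketch of such a converter, covering only the SLPM and SCCM relationship given above. The class and method names are hypothetical, and a real version would need the device's full unit list.

```python
class UnitConverter:
    """Convert flow values to and from the device's configured units."""

    # Size of each unit expressed in SCCM (SLPM = SCCM / 1000.0).
    _SCCM_PER_UNIT = {"SCCM": 1.0, "SLPM": 1000.0}

    def __init__(self, device_units):
        if device_units not in self._SCCM_PER_UNIT:
            raise ValueError("Unsupported units: {}".format(device_units))
        self.device_units = device_units

    def to_device(self, value, units):
        """Convert a value given in `units` into the device's units."""
        return (value * self._SCCM_PER_UNIT[units]
                / self._SCCM_PER_UNIT[self.device_units])

    def from_device(self, value, units):
        """Convert a value reported by the device into `units`."""
        return (value * self._SCCM_PER_UNIT[self.device_units]
                / self._SCCM_PER_UNIT[units])


converter = UnitConverter("SCCM")
print(converter.to_device(0.376, "SLPM"))    # about 376 SCCM
print(converter.from_device(376.0, "SLPM"))  # about 0.376 SLPM
```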

patrickfuller commented 4 years ago

@RickPattonAlicat thanks for replying! I think there's a lot we could do with this Alicat library (although it'd be a gradual nights-and-weekends project for me).

After having run this library for a few years on a variety of devices, I'm pretty convinced that the most stable route would be to directly read/write registers. The main serial command varies between devices (number of fields, length per field), which is fine most of the time but occasionally leads to misreads.

In my mind, the ideal driver would request a batch of data on __init__, including version, max flow, units, meter vs controller, selected gas, available gases. This would get cached and be accessible via __repr__ for interactive testing. If we have the max flow, we'd then be able to calculate and write the 16-bit setpoint register to the highest possible accuracy without changing the python API. Same with custom gases and better error messages for unavailable features.

Do you have more documentation on available registers? Also, is there a way to read multiple registers in one command? Happy to move this conversation to email or phone!

dom-insytesys commented 4 years ago

@RickPattonAlicat That's super useful information. Thank you!

I'll give the "FPF" command a shot to pull out the current units and full scale. If nothing else, that's preferable to what I'm doing now, which is to guess max flow and units based on the MFC model number. I assume the former is reliable, but the latter could potentially be changed via the front panel.

On the subject of usable precision, what is the downside of setting the desired mass flow setpoint as an integer, i.e. round(setpoint / max_flow * 64000)? It seems to me that this sidesteps any issue of how many digits of precision are appropriate.
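
A minimal sketch of that scaling, assuming 64000 counts correspond to the device's full-scale flow, so the count is the setpoint divided by max flow and multiplied by 64000. The function names are hypothetical.

```python
FULL_SCALE_COUNTS = 64000


def setpoint_to_counts(setpoint, max_flow):
    """Scale a flow setpoint to an integer count out of 64000."""
    if not 0 <= setpoint <= max_flow:
        raise ValueError("Setpoint must be between 0 and max flow.")
    return round(setpoint / max_flow * FULL_SCALE_COUNTS)


def counts_to_setpoint(counts, max_flow):
    """Convert an integer count back to flow units."""
    return counts * max_flow / FULL_SCALE_COUNTS


counts = setpoint_to_counts(0.376, 1.0)  # 1 SLPM full-scale device
print(counts)                            # 24064
print(counts_to_setpoint(counts, 1.0))   # 0.376
```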

RickPattonAlicat commented 4 years ago

@dom-insytesys This could be preferable in many circumstances. For example, our devices approximately pre-2018 would typically have a 10000 count max precision on the fullscale flow (with some exceptions depending on some customer ordering preferences). In most cases, you would get exact precision using the 64000 count register, but on some edge cases, you may have a 1-count rounding error. In general there can be some configurations with a not-so-round number of fullscale flow counts, or even an extra digit of decimal resolution (e.g. 10.000 SCCM), so some devices would result in a loss of precision. Additionally, with most devices since 2018, we've typically been able to expand the usable resolution by a full digit, so there may be a loss in precision in, for example, a 75.000 SLPM device, with a 64000-count setpoint resolution of 0.0012/count.

As a general practice, we like to encourage using ASCII commands where possible to avoid accidentally overwriting the wrong register. While that is less of a concern using a pre-built package like this, where the user isn't actively typing register values into a console, I still wanted to raise the point.

Another, less obvious reason would be to avoid excessive writes to the EEPROM. I believe our EEPROMs have a lifetime on the order of 1-million writes (don't quote me on that 😅 ), but if you're running a complicated script with a rapidly-varying setpoint, those writes to an EEPROM register could accumulate quickly. We have a register setting that can disable saving setpoints to the EEPROM to avoid burning it out in such a scenario.

So perhaps a more flexible solution would allow the 64000-scale setpoint with one function/property, and the floating-point setpoint in another function/property? I'll get with @patrickfuller over email to discuss further register map documentation, etc.

dom-insytesys commented 4 years ago

@RickPattonAlicat I don't really understand your reasoning: configuring the setpoint as integer doesn't involve explicitly writing to a register. According to page 42 of the flow controller manual, you just send the integer as ASCII digits. This seems to work as expected in my tests. Weirdly, there are no command characters at all, just a device address character, so in a way this is the default way of programming the controller. There doesn't seem to be any way to mess up and do the wrong thing.

I don't understand your resolution examples, either. If the flow controller is a 75.000 SLPM device, I'm assuming that means that configuring the setpoint as a float would set it to the nearest 0.001 SLPM. And as you observe, configuring the setpoint as an integer would round to the nearest 0.00117 SLPM, which is pretty much exactly the same thing.

Any sane application should not blindly rely on the flow (or pressure) setpoint being set exactly as configured. What we do is to read the MFC response, and use that as the recorded setpoint. Either way seems vulnerable to rounding errors.

On which subject, @patrickfuller, we ended up bypassing the set_flow_rate() command in numat/serial.py because it doesn't return the configured setpoint. It just raises an IOError exception if the difference between target setpoint and actual setpoint is outside an arbitrary tolerance. That's handy, but ideally (at least, the way we re-wrote the function) it would return either the raw response string, or a parsed dictionary, just as get() does. Since the function is already retrieving this information, there doesn't seem to be much downside.

If we hadn't modified your code to return the actual configured setpoint, I would have never noticed that our experimental errors were due to the setpoint being truncated much more than I expected.

patrickfuller commented 4 years ago

@dom-insytesys this is also something that's not consistent across Alicats. Not all models return a full line on set (see comment).

It's bad practice to have a library that behaves differently on different device versions, so this library drops the response if it exists. The setpoint check was a nice bonus, allowing us to do something with the data without making the API ambiguous. (The other option would be to have old controllers run a silent get but that just moves the comm overhead to older devices).

Ideally, we'd be able to handle these quirks with a table of device versions, but, until then, we need to balance features with backward compatibility.

dom-insytesys commented 4 years ago

@patrickfuller That's a good point. But in a way, what the library does is already device-dependent. On devices that return a response, it silently performs a check on that response. Also, (and I don't have such a device to test) I assume that on older controllers, the _readline() spends a lot of time trying to read a response before timing out. So there's already a significant, unnecessary comms overhead.

In an ideal world, at instantiation, there would be a check of MFC capabilities, and for older devices, the _readline() would never be tried, and instead a get() or a specific "FPF" query would be used instead. Perhaps with an optional "no_check" flag to enable the user to skip this if they don't care.

Either way, _set_setpoint() already captures and parses the MFC response. And the current API doesn't return anything, so if you changed the code to return the response in some form, it shouldn't break any existing code. And you could easily return None if the MFC is an old device that doesn't auto-respond with its new status.
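
A sketch of what that return value could look like, reusing the field position that _set_setpoint() already reads. The helper name is hypothetical and the sample reply is made up for illustration; the real field layout varies by model.

```python
def parse_setpoint_reply(line):
    """Return the echoed setpoint as a float, or None if there is no usable reply."""
    if not line:
        return None
    try:
        # Same field position that _set_setpoint() reads: second from the end.
        return float(line.split()[-2])
    except (IndexError, ValueError):
        return None


# Made-up reply line for illustration only.
reply = "A +014.70 +025.00 +000.38 +000.38 0.380 Air"
print(parse_setpoint_reply(reply))  # 0.38
print(parse_setpoint_reply(""))     # None
```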

@RickPattonAlicat: Is there a reliable way to query an Alicat MFC to find out if it will return a status response after a setpoint is configured? Also, is there a way to disable this feature? The reason I ask is that in our test rig, we have several MFC's that we would like to configure to a new setpoint as close to simultaneously as possible. Using Patrick's set_flow_rate() function, the _write_and_read() spends about 1 millisecond in the write() part, and then about 30-50 ms in the _readline() portion. In an ideal world, I would like to separate the two parts: i.e. send a new setpoint to all the MFC's, and only check their status after all the setpoints have been sent. At which point, if any didn't receive the setpoint correctly (a situation I have yet to encounter in my testing), the test runner software can re-issue the setpoint command. Right now, I've modified our code to work this way, but the subsequent get() commands often encounter communication errors, presumably because the MFC's are already trying to send back their status without waiting to be prompted to do so. My code retries the get() and that consistently succeeds the second time around, so ultimately the setpoints get confirmed, but it feels ugly to be dropping/mangling the automatic response. It would be nice either to disable the auto-response, or to issue the new setpoint with a different command that doesn't trigger the response.

RickPattonAlicat commented 4 years ago

@dom-insytesys Sorry, I think I misunderstood your original question regarding the 64000 setpoint command. I thought you were suggesting to write directly to the setpoint register via a AW[reg]=[value] command, rather than the A64000-type command, which does not write to EEPROM under the register setting I mentioned, and of course couldn't overwrite the wrong register.

Agreed, under the 0.00117/count example, it's "pretty much" the same thing, but not exactly the same thing. For example, at 75/64000 SLPM per count, it would be impossible to send a 0.003 SLPM setpoint, since A2 results in 0.002 and A3 results in 0.004. Across the full range in this example, there would be 11000 specific setpoint values that could not be reached by the command. Likely not an issue in most use cases, but could result in some frustration and an additional issue thread down the road.
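
The arithmetic can be checked by brute force. This sketch assumes the device rounds each integer setpoint to the nearest 0.001 SLPM step, which is an assumption about firmware behavior rather than something stated in the manual.

```python
FULL_SCALE_MILLI = 75000  # 75.000 SLPM expressed in 0.001 SLPM steps
COUNTS = 64000


def displayed_milli(counts):
    """0.001 SLPM steps reached by an integer setpoint (nearest-step rounding assumed)."""
    return round(counts * FULL_SCALE_MILLI / COUNTS)


reachable = {displayed_milli(n) for n in range(COUNTS + 1)}
unreachable = (FULL_SCALE_MILLI + 1) - len(reachable)

print(displayed_milli(2), displayed_milli(3))  # 2 4
print(3 in reachable)                          # False
print(unreachable)                             # 11000
```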

With your multiple device setpoints question, if you're giving the same setpoint to all, you can use an asterisk to talk to all connected devices at once. Perhaps @patrickfuller would consider adding an ignore_response parameter in _write_and_read(), which would skip the line = self._readline() statement and just return None instead. Below is a function I use within my own in-progress API to send a generic command, along with various parameters for modifying its behavior:

    def _sendcommand(self, cmd, wait=0.075, verbose=0, read=1, flush=0, clean=1):
        """Sends command to device per alicat and serial formatting
        Waits a number of seconds specified by [wait].
        If verbose=1, print a copy of the command to the terminal.
        If read=0, do not attempt to read the buffer.
        If flush=1, reset the buffer after waiting [wait] seconds.
        If clean=0, do not clear the buffer prior to sending the command.
        """
        if "*" not in cmd:
            cmd = self._device_id + cmd

        if clean:
            self.ser.reset_input_buffer()
        if verbose:
            print(cmd)
        self.ser.write(cmd.encode('utf-8') + b'\r')
        time.sleep(wait)
        if flush:
            self.ser.reset_input_buffer()
        if read==0:
            return ''
        out = self.readbuffer() #defined elsewhere, allows for multi-line responses
        return out

For example, to achieve what you want, you could do:

    for dut in duts:
        dut._sendcommand("S{}".format(setpoint), wait=0.001, read=0, clean=0)
    time.sleep(0.075)
    # Remember to clear the receiving buffer after all the setpoints are sent.
    # duts[0].ser is assumed to be the serial port shared by every device.
    duts[0].ser.reset_input_buffer()

I don't believe there is a "disable serial response" mode, but let me double-check.

RickPattonAlicat commented 4 years ago

@dom-insytesys confirmed, there's no way to disable the device from returning a dataframe in response to a setpoint command. I believe the best way to achieve this is either the asterisk command, e.g. *S1.234, or by modifying _write_and_read() to allow the user to specify a parameter to ignore the response, and just return to the parent as soon as the command is sent.
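
A minimal sketch of the second option. The ignore_response parameter is hypothetical and not part of the current library, and FakePort stands in for a real serial port so the example runs without hardware.

```python
class FakePort:
    """Stand-in for a serial port that records writes and counts reads."""

    def __init__(self):
        self.sent = []
        self.reads = 0

    def write(self, data):
        self.sent.append(data)

    def readline(self):
        self.reads += 1
        return b"A +000.38\r"


def write_and_read(port, command, ignore_response=False):
    """Send a command; optionally return without waiting for the reply."""
    port.write(command.encode("ascii"))
    if ignore_response:
        return None
    return port.readline().decode("ascii").strip()


port = FakePort()
for address, setpoint in (("A", 0.376), ("B", 0.250)):
    command = "{}S{:.3f}\r".format(address, setpoint)
    write_and_read(port, command, ignore_response=True)

print(port.sent)   # [b'AS0.376\r', b'BS0.250\r']
print(port.reads)  # 0
```

On real hardware the devices still send their replies, so the input buffer would need to be cleared (for example with pySerial's reset_input_buffer()) before the next get().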

dom-insytesys commented 4 years ago

@RickPattonAlicat Thanks again for your assistance.

Sending the same setpoint to all MFC's is, unfortunately, not an option. But it sounds like reset_input_buffer() might fix the garbled response to get(), which I assume is caused by all the MFC's returning their dataframes and my code not reading them.

patrickfuller commented 4 years ago

@dom-insytesys you're stuck with synchronous comm if you use Alicat's addressing, but you could always use a serial hub instead. Dedicated hubs would let you asynchronously request/respond. Some detail is here and this is how we run most of our systems.

numat / alicat