sinara-hw / Booster

Modular 8-channel RF power amplifier

TVS testing #350

Closed hartytp closed 4 years ago

hartytp commented 4 years ago

Testing a pair of Booster v1.5 with all three amp chips replaced and TVS added by TechnoSystem. https://github.com/sinara-hw/Booster/issues/344

Firmware is the latest emailed to me by CTI (would be good to get this back under version control!).

First test: power cycle repeatedly while looking at the bias currents to check for odd hysteresis.

Version(fw_rev='v1.4.1', fw_hash='d0f3d02', fw_build_date=datetime.datetime(2019, 12, 24, 12, 49, 18), device_id='30374705', hw_rev='hw rev 1.4')

0 1 2 3 4 5 6 7
0.053599112 0.049663515 0.049850923 0.047976828 0.050975378 0.050600562 0.052662065 0.051162786
0.05378652 0.050038331 0.050413154 0.048351644 0.051350201 0.051162786 0.053036884 0.052099834
0.05378652 0.050038331 0.050413154 0.048351644 0.051162786 0.051162786 0.053036884 0.051912425
0.05378652 0.050038331 0.050413154 0.048351644 0.051162786 0.050975378 0.052849473 0.051912425
0.05378652 0.050038331 0.050413154 0.048164236 0.051162786 0.050975378 0.052849473 0.051912425
0.053599112 0.050038331 0.050225739 0.048164236 0.050975378 0.050975378 0.052849473 0.051725017
0.053599112 0.049850923 0.050225739 0.048164236 0.050975378 0.050975378 0.052662065 0.051725017
0.053599112 0.049850923 0.050225739 0.048164236 0.050975378 0.05078797 0.052662065 0.051725017
0.053599112 0.049850923 0.050225739 0.047976828 0.050975378 0.05078797 0.052662065 0.051725017
0.05378652 0.050038331 0.050225739 0.048164236 0.051162786 0.050975378 0.052849473 0.051725017
0.05378652 0.050038331 0.050413154 0.048164236 0.051162786 0.050975378 0.052849473 0.051725017
0.05378652 0.050038331 0.050225739 0.048164236 0.050975378 0.050975378 0.052662065 0.051725017
0.05378652 0.049850923 0.050225739 0.047976828 0.050975378 0.05078797 0.052662065 0.051537609
0.053599112 0.049850923 0.050225739 0.047976828 0.050975378 0.05078797 0.052662065 0.051537609
0.053599112 0.050038331 0.050225739 0.047976828 0.050975378 0.05078797 0.052662065 0.051537609
0.053599112 0.049850923 0.050225739 0.047976828 0.050975378 0.05078797 0.052662065 0.051537609
0.053599112 0.049850923 0.050225739 0.047976828 0.050975378 0.05078797 0.052662065 0.051537609
0.053599112 0.049850923 0.050225739 0.047976828 0.05078797 0.05078797 0.052474657 0.051537609
0.053599112 0.049850923 0.050225739 0.047976828 0.05078797 0.05078797 0.052474657 0.051537609
0.053599112 0.049850923 0.050225739 0.047976828 0.05078797 0.05078797 0.052662065 0.051537609
0.053599112 0.049850923 0.050225739 0.047976828 0.050975378 0.05078797 0.052662065 0.051725017
0.053599112 0.049850923 0.050225739 0.047976828 0.050975378 0.05078797 0.052662065 0.051725017
0.053411704 0.049476106 0.049663515 0.04778942 0.05078797 0.050413154 0.052287249 0.05078797

Version(fw_rev='v1.4.1', fw_hash='d0f3d02', fw_build_date=datetime.datetime(2019, 12, 24, 12, 49, 18), device_id='34364708', hw_rev='hw rev 1.3')

0 1 2 3 4 5 6 7
0.049645106 0.050450162 0.046290707 0.051926097 0.049779282 0.051255217 0.051255217 0.051791921
0.049913458 0.050584338 0.046156531 0.052328625 0.049779282 0.051523569 0.051389393 0.051255217
0.04951093 0.049645106 0.045888179 0.049645106 0.049645106 0.05018181 0.04951093 0.050718514
0.04951093 0.04951093 0.045619827 0.049108402 0.049645106 0.050047634 0.049242578 0.050584338
0.04951093 0.049376754 0.045619827 0.048974226 0.04951093 0.049913458 0.049108402 0.050450162
0.049376754 0.049242578 0.045485651 0.04884005 0.049242578 0.049779282 0.048974226 0.050450162
0.049376754 0.049242578 0.045619827 0.048571698 0.049242578 0.049913458 0.048974226 0.050315986
0.051389393 0.049108402 0.045485651 0.048571698 0.048974226 0.049779282 0.048705874 0.050315986
0.049242578 0.048974226 0.045485651 0.04816917 0.049108402 0.049645106 0.048705874 0.050315986
0.04951093 0.049913458 0.045619827 0.049645106 0.047229939 0.050718514 0.050718514 0.050852689
0.049376754 0.049376754 0.045351475 0.048705874 0.048974226 0.049779282 0.048974226 0.050450162
0.049242578 0.049108402 0.045351475 0.048437522 0.04884005 0.049645106 0.04884005 0.050450162
0.051389393 0.048974226 0.045217299 0.048437522 0.04884005 0.049645106 0.048705874 0.050315986
0.049242578 0.048974226 0.045217299 0.048303346 0.048974226 0.049645106 0.048571698 0.05018181
0.049242578 0.048974226 0.045217299 0.048303346 0.048705874 0.049645106 0.048571698 0.05018181
0.049242578 0.048974226 0.045083124 0.04816917 0.048705874 0.04951093 0.048705874 0.050315986
0.049108402 0.048974226 0.045083124 0.048303346 0.048705874 0.04951093 0.048571698 0.05018181
0.049242578 0.048974226 0.045217299 0.048303346 0.048705874 0.04951093 0.048571698 0.050315986
0.049108402 0.04884005 0.045083124 0.04816917 0.048705874 0.04951093 0.048571698 0.050047634
0.049242578 0.04884005 0.044948948 0.048034994 0.04884005 0.049645106 0.048303346 0.049779282
0.049242578 0.048974226 0.044948948 0.048303346 0.048705874 0.04951093 0.048437522 0.050047634
0.049242578 0.049242578 0.044948948 0.048571698 0.048571698 0.049779282 0.04884005 0.050047634
hartytp commented 4 years ago

Here are those same numbers converted into mA and with mean subtracted for readability...

device_id='30374705' (mA, mean subtracted):
0 1 2 3 4 5 6 7
-0.06 -0.24 -0.37 -0.10 -0.02 -0.25 -0.04 -0.47
0.13 0.14 0.19 0.28 0.35 0.31 0.33 0.46
0.13 0.14 0.19 0.28 0.16 0.31 0.33 0.28
0.13 0.14 0.19 0.28 0.16 0.12 0.15 0.28
0.13 0.14 0.19 0.09 0.16 0.12 0.15 0.28
-0.06 0.14 -0.00 0.09 -0.02 0.12 0.15 0.09
-0.06 -0.05 -0.00 0.09 -0.02 0.12 -0.04 0.09
-0.06 -0.05 -0.00 0.09 -0.02 -0.07 -0.04 0.09
-0.06 -0.05 -0.00 -0.10 -0.02 -0.07 -0.04 0.09
0.13 0.14 -0.00 0.09 0.16 0.12 0.15 0.09
0.13 0.14 0.19 0.09 0.16 0.12 0.15 0.09
0.13 0.14 -0.00 0.09 -0.02 0.12 -0.04 0.09
0.13 -0.05 -0.00 -0.10 -0.02 -0.07 -0.04 -0.10
-0.06 -0.05 -0.00 -0.10 -0.02 -0.07 -0.04 -0.10
-0.06 0.14 -0.00 -0.10 -0.02 -0.07 -0.04 -0.10
-0.06 -0.05 -0.00 -0.10 -0.02 -0.07 -0.04 -0.10
-0.06 -0.05 -0.00 -0.10 -0.02 -0.07 -0.04 -0.10
-0.06 -0.05 -0.00 -0.10 -0.21 -0.07 -0.23 -0.10
-0.06 -0.05 -0.00 -0.10 -0.21 -0.07 -0.23 -0.10
-0.06 -0.05 -0.00 -0.10 -0.21 -0.07 -0.04 -0.10
-0.06 -0.05 -0.00 -0.10 -0.02 -0.07 -0.04 0.09
-0.06 -0.05 -0.00 -0.10 -0.02 -0.07 -0.04 0.09
-0.24 -0.42 -0.56 -0.29 -0.21 -0.44 -0.42 -0.85

device_id='34364708' (mA, mean subtracted):
0 1 2 3 4 5 6 7
0.10 1.16 0.88 3.03 0.81 1.34 2.15 1.37
0.37 1.30 0.75 3.43 0.81 1.61 2.29 0.83
-0.03 0.36 0.48 0.74 0.67 0.27 0.41 0.29
-0.03 0.23 0.21 0.21 0.67 0.13 0.14 0.16
-0.03 0.09 0.21 0.07 0.54 0.00 0.01 0.02
-0.16 -0.04 0.08 -0.06 0.27 -0.13 -0.13 0.02
-0.16 -0.04 0.21 -0.33 0.27 0.00 -0.13 -0.11
1.85 -0.18 0.08 -0.33 -0.00 -0.13 -0.40 -0.11
-0.30 -0.31 0.08 -0.73 0.13 -0.27 -0.40 -0.11
-0.03 0.63 0.21 0.74 -1.74 0.81 1.62 0.43
-0.16 0.09 -0.05 -0.20 -0.00 -0.13 -0.13 0.02
-0.30 -0.18 -0.05 -0.46 -0.13 -0.27 -0.26 0.02
1.85 -0.31 -0.19 -0.46 -0.13 -0.27 -0.40 -0.11
-0.30 -0.31 -0.19 -0.60 -0.00 -0.27 -0.53 -0.24
-0.30 -0.31 -0.19 -0.60 -0.27 -0.27 -0.53 -0.24
-0.30 -0.31 -0.32 -0.73 -0.27 -0.40 -0.40 -0.11
-0.43 -0.31 -0.32 -0.60 -0.27 -0.40 -0.53 -0.24
-0.30 -0.31 -0.19 -0.60 -0.27 -0.40 -0.53 -0.11
-0.43 -0.45 -0.32 -0.73 -0.27 -0.40 -0.53 -0.38
-0.30 -0.45 -0.46 -0.87 -0.13 -0.27 -0.80 -0.65
-0.30 -0.31 -0.46 -0.60 -0.27 -0.40 -0.66 -0.38
-0.30 -0.04 -0.46 -0.33 -0.40 -0.13 -0.26 -0.38
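
For anyone wanting to repeat the conversion, here is a minimal sketch of how it could be done. It assumes the raw readings are in amps, with one row per power cycle and one column per channel; the function name is made up for illustration and is not the script that produced the numbers above.

```python
def to_ma_mean_subtracted(rows_amps):
    """Convert per-channel readings from A to mA and subtract each channel's mean.

    rows_amps: list of rows, one row per power cycle, one value per channel.
    """
    n_rows = len(rows_amps)
    n_chan = len(rows_amps[0])
    # Column-wise (per-channel) mean, still in amps
    means = [sum(row[ch] for row in rows_amps) / n_rows for ch in range(n_chan)]
    # Deviation from the channel mean, scaled to mA
    return [[(row[ch] - means[ch]) * 1e3 for ch in range(n_chan)]
            for row in rows_amps]


# Channels 0 and 1 of the first two readings from the first unit
sample = [
    [0.053599112, 0.049663515],
    [0.05378652, 0.050038331],
]
for row in to_ma_mean_subtracted(sample):
    print(" ".join("{:.2f}".format(v) for v in row))
```

Because the sample only averages over two rows, its output will not match the posted tables, which subtract the mean over all power cycles.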
hartytp commented 4 years ago

hmm....it still strikes me as odd that we see the occasional large variation in bias current on some channels. @gkasprow what do you think?

NB the script I used for this was basically

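import time

# Booster is the device driver class; its import was not shown in the original post.


def main(args):
    ip = args.dev
    addr = "10.255.6." + ip

    dev = Booster(addr)
    print("{}".format(dev.get_version()))
    dev.close()

    # Table header, one column per channel
    print(" | ".join([str(chan) for chan in range(8)]))
    print(" | ".join(["---"]*8))

    while True:
        # Retry until the Booster is reachable again after a power cycle
        try:
            dev = Booster(addr)
        except Exception:
            time.sleep(1)
            continue

        time.sleep(1)

        for chan in range(8):
            dev.set_enabled(chan, True)

        # Give the bias currents time to settle before reading them
        time.sleep(1)
        currents = [str(dev.get_current(chan)) for chan in range(8)]
        print(" | ".join(currents))

        # Poll until the connection drops, i.e. the unit has been powered off
        while True:
            try:
                dev.get_version()
                time.sleep(1)
            except Exception:
                break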

Given the long delays I would have expected everything to have settled to much better than 1mA...

hartytp commented 4 years ago

Quick log of the bias currents over time. Will post more data on Monday (as much as I get before the sw crashes)...

image

image

hartytp commented 4 years ago

@gkasprow what do you think those large bias current spikes are? Glitches reading out the ADC? Or, actual large variations in the bias current? That's pretty huge so if it's real it's a bit scary, no?

On the v1.4 HW the InAmp output is 43V/V*0.1Ohm = 4.3V/A. So, looking for 1mA spikes in the bias current is going to be tough (last time I tried naively doing this with a scope there was too much pick up to see a 1mV signal reliably).
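
To make the arithmetic explicit, here is a small sketch using only the two figures quoted above (43V/V gain and a 0.1Ohm shunt). The constant and function names are made up for illustration.

```python
INAMP_GAIN_V_PER_V = 43.0   # InAmp gain quoted above for the v1.4 HW
SHUNT_OHMS = 0.1            # sense resistor value quoted above


def inamp_output_volts(current_amps):
    """InAmp output voltage for a given bias current through the shunt."""
    return INAMP_GAIN_V_PER_V * SHUNT_OHMS * current_amps


# Sensitivity in V/A, and the signal produced by a 1 mA bias-current step
sensitivity = inamp_output_volts(1.0)
step_mv = inamp_output_volts(1e-3) * 1e3
print("sensitivity: {:.1f} V/A".format(sensitivity))
print("1 mA step:   {:.1f} mV".format(step_mv))
```

So a 1mA spike would show up as roughly 4.3mV at the InAmp output, which is in the same few-mV range where scope pick-up was already a problem.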

@gkasprow how do you suggest we debug this?

hartytp commented 4 years ago

This means we definitely still see https://github.com/sinara-hw/Booster/issues/325 even on a newly refurbished booster with the TVS added (I haven't even applied RF yet).

Based on previous experience, my suspicion is that this is a measurement glitch since I do not recall seeing gain/output power variations accompanying these glitches. I'll gather more statistics overnight and then double check whether the bias current noise does have corresponding gain/output power variations or not.

NB it's hard to say anything conclusive about the bias current hysteresis between power cycles while the noise statistics are non-stationary...

hartytp commented 4 years ago

A bit more data that clearly shows the problem...

image

image

One comment: the Booster that shows the problem much worse is a v1.3; the one that seems better is a v1.4. There were various changes between the two revisions, such as a new InAmp and fixing Vcc for the ADC. Not sure if these are related or not but, if they are, they don't appear to have fixed the issue entirely...

hartytp commented 4 years ago

One final observation...I changed my logger to use the chan:diag (rather than just querying the 30V current directly). I now see something qualitatively different.

image

image

This really does start to feel like a fw bug rather than a hw bug.

hartytp commented 4 years ago

Also recorded the 5VMP rail.

image

Will post more data on Monday, but this definitely feels like an issue either with the ADC or with the firmware. The fact that both bias rails and the 5VMP show similar glitches strongly suggests that we're looking at a measurement artifact.
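
One way to put a number on "similar glitches" would be to check whether glitches on two logged quantities land on the same samples. The sketch below assumes both logs were sampled at the same instants; the function names, thresholds and example values are all made up for illustration and are not measured data.

```python
from statistics import median


def glitch_indices(samples, threshold):
    """Indices where a sample deviates from the series median by more than threshold."""
    mid = median(samples)
    return {i for i, v in enumerate(samples) if abs(v - mid) > threshold}


def coincident_fraction(series_a, series_b, threshold_a, threshold_b):
    """Fraction of glitches in series_a that occur at the same index in series_b."""
    a = glitch_indices(series_a, threshold_a)
    b = glitch_indices(series_b, threshold_b)
    if not a:
        return 0.0
    return len(a & b) / len(a)


# Made-up logs sampled at the same instants (not measured data)
bias_ma = [50.0, 50.1, 49.9, 53.0, 50.0, 50.1, 47.2, 50.0]
rail_5vmp = [5.00, 5.01, 5.00, 5.40, 5.00, 4.99, 4.60, 5.00]
print(coincident_fraction(bias_ma, rail_5vmp, threshold_a=1.0, threshold_b=0.2))
```

A fraction close to 1 on otherwise unrelated rails would point towards the shared readout path (ADC or firmware), although it would not by itself rule out a real disturbance common to both rails.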

hartytp commented 4 years ago

The statistics of the glitches also seem very non-stationary and potentially change every time I open and close the ethernet connection (although it's hard to be sure about that and I have weak statistics at best for that claim...)

jordens commented 4 years ago

https://github.com/sinara-hw/Booster/issues/329#issuecomment-562508365

I'd really like to be able to sell these, develop on them, and provide integration, support, and features. But it currently looks like both quality and management of software are not where we need them. And I'm uncertain what the roadmap is. What options do we have? Wait and hope? Open up the software and review it properly? Reimplement? Fork?

hartytp commented 4 years ago

> 329 (comment)

Sure, I recall that.

> But it currently looks like both quality and management of software are not where we need them. And I'm uncertain what the roadmap is.

Indeed.

> What options do we have? Wait and hope? Open up the software and review it properly? Reimplement? Fork?

Creotech have recently started engaging with the software and have made some progress fixing the bugs.

The timeline is this:

We will install these two patched Boosters in our experiments next week (Monday hopefully) while we send another couple back for patching. In the coming weeks we will have ~50 channels installed in experiments.

The hope is to get good statistics to back up the claim that the hardware issues are now fixed with Booster and all that remains is the software (although that would be easier if the SW diagnostics worked correctly).

Once we reach that state it will become easier to drum up the resources to find a long-term solution to the software issues. While CTI are putting in a valiant effort on the software, my last review of the code-base made me somewhat uncomfortable with the entire way it's structured: very much old-school C code, overly complex and fragile, with almost no documentation. If the firmware isn't working without issue by the time we finish testing the hw, then we will consider cutting our losses and e.g. funding a Rust revision (although this will be easier if a consortium of users can pool together to help with funding). I'd like to reach a baseline of functionality before we try this so that we have some confidence in the claim that with well-written code the device will function as intended.

hartytp commented 4 years ago

AFAICT, the loggers didn't crash overnight. Here is the data

image

same, but removing points on the 5VMP rail with very large errors.

image
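
The criterion used to drop the bad 5VMP points is not stated above. As one possible approach, here is a sketch that rejects samples lying far from the median, measured in median absolute deviations (MAD). The function name, the cut-off of 5 MADs and the example values are assumptions for illustration only.

```python
from statistics import median


def drop_outliers(samples, n_mads=5.0):
    """Keep samples within n_mads median absolute deviations of the median."""
    mid = median(samples)
    mad = median(abs(v - mid) for v in samples)
    if mad == 0:
        # No usable spread estimate, so leave the data untouched
        return list(samples)
    return [v for v in samples if abs(v - mid) <= n_mads * mad]


# Made-up 5VMP readings in volts with two obviously bad points
rail = [5.00, 5.01, 4.99, 5.02, 9.80, 5.00, 4.98, 0.10]
print(drop_outliers(rail))
```

The median and MAD are used instead of the mean and standard deviation because a handful of very large glitches would otherwise inflate the threshold and let themselves through.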

same, but zooming in on the start

image

Anyway, as discussed before, I think the most likely conclusion here is that this booster works fine hardware-wise, but there are some software glitches that mess up the diagnostics.

gkasprow commented 4 years ago

Thanks for doing the measurements. I had a short break due to flu; I'm back to life now. Indeed, it looks like a firmware issue. I observed the I2C ADC inputs with a precise DMM and did not see any variations that could explain the observations. It looks like I have to spend another week or so studying the ADC behavior in depth.

jordens commented 4 years ago

Good to have you back!

hartytp commented 4 years ago

Welcome back @gkasprow

> It looks like I have to spend another week or so studying deeply the ADC behavior.

Do you have HW to do that? Can you do this next week? I really want to get rid of these final issues and make Booster a nice product to use...

gkasprow commented 4 years ago

I have the chassis and two modules

hartytp commented 4 years ago

> I have the chassis and two modules

Is that the one of ours that CTI had?

When do you think you can look at this issue?

gkasprow commented 4 years ago

This one is brand new from TS. I will have a look on Tuesday. I need @wizath's help.

hartytp commented 4 years ago

So the conclusion then is that -- modulo frustrating mechanical issues which we will resolve in v1.5 and frustrating ADC/firmware issues that @gkasprow and @wizath will resolve next week -- we are so far not aware of any issues with Booster after fitting TVS. Boosters with TVS are now in live experiments so any issues should be uncovered in short order. Working diagnostics (getting rid of ADC glitches and making sure that the SCPI interface doesn't cause hard faults) would make it much easier to verify performance.

@gkasprow any updates on how Creotech are getting on? They still haven't sent me a Booster with TVS (they've been working on that for over a month; any idea what's taking so long?) and haven't responded to any of my emails about firmware bugs...

gkasprow commented 4 years ago

Bartek is working on the ADC issue, but he has another Booster to play with. They will ship one with TVS ASAP.

hartytp commented 4 years ago

Perfect, thanks!

hartytp commented 4 years ago

Having fixed the screws, we haven't noticed any more suspicious behaviour on the 16 patched channels in our experiment. If all still looks good in a few weeks we'll prepare v1.5 for manufacture.

@gkasprow any update on the TVS-patched Booster CTI are preparing for us? They've been doing that for a few weeks now, is there a reason it's taking so long??

hartytp commented 4 years ago

@dnadlinger what's your feedback from the two patched Boosters you have? Have they been heavily used without issue? We should get another patched one from TS tomorrow. Now that we have firmware with reliable ADC readings, I'll have another go at fuzzing it and see if it can survive.

hartytp commented 4 years ago

We now have decent statistics to back up the claim that with the latest round of fixes (including TVSs) Boosters are indeed reliable.

Now that the hardware has stabilized we should re-characterise it, but I'll leave that for another issue/day.