helena-project / imix

imix Low-Power IoT Research Platform

RF233/RNG enable browning out sam4l #12

Open alevy opened 7 years ago

alevy commented 7 years ago

From my test, enabling the RNG (by clearing PC19 on the SAM4L) browns out the SAM4L on a patched board of the most recent revision. @shaneleonard any suggestions on how to diagnose further? It might be worth you trying to replicate as well, since it's definitely plausible I'm doing something wrong (I just enabled PC19 as an output then cleared it. If I set it instead, no problems--but also no RNG of course).

alevy commented 7 years ago

I actually think this may have happened because my power supply (a passive USB hub) wasn't supplying enough power. It should be able to do up to 100mA, which very well might be less than what the two MCUs + the RF233 + the RNG + the sensors draw on boot. I'm going to try and test this today.

alevy commented 7 years ago

OK, I'm able to run the RNG and continue with the OS as normal as long as I'm powering the board from two USB sources (currently the FTDI as well as power through the JLink). Ideally I would try this with an independently powered USB hub, but I don't have one on me. I should also try this with a large power supply...

However, this doesn't seem like great behavior if we can't power the board from a typical USB port.

@shaneleonard does my analysis seem reasonable? If so, would it make sense to stick a current limiting resistor or something inline with the RNG power supply?

/cc @kwantam

ppannuto commented 7 years ago

I thought you had to be able to supply 500 mA to be USB compliant (even a "passive hub" ?).

Not that lowering the power draw is a bad idea, but it may also be good to have the FTDI chip negotiate for more power: http://www.ftdichip.com/Support/Documents/TechnicalNotes/TN_113_Simplified%20Description%20of%20USB%20Device%20Enumeration.pdf (from http://electronics.stackexchange.com/questions/5498/how-to-get-more-than-100ma-from-a-usb-port )

shaneleonard commented 7 years ago

So the device actually regulating the power is the BQ24230, which is currently configured to be in USB 100 mA mode. By setting the EN1 pin high instead of low, it can be used for 500mA (the BQ24230 doesn't actually negotiate for more power--like Pat said, that's the FTDI's job--all it does is limit the power input). According to Ben, the RNG draws a 30 mA average current, but it looks like it actually spikes above that level. I would guess that the average current draw is <100mA but the peak current draw is >100mA.

So it looks like I should change EN1 on the BQ24230 to be high so that it limits the input current to 500mA instead of 100mA, and the FTDI chip needs to negotiate for more power. This also means that the RNG won't work with the native USB port (i.e. the non-FTDI one) until the Tock native USB driver is capable of device enumeration/negotiation for more power.

For now, there are a couple of workarounds. First, if the RNG only requires slightly more than 100mA, the BQ24230 might let it slide (it has some built in wiggle room for charging capacitive loads). It's possible that just setting the FTDI chip to negotiate for 500mA could be enough to fix the problem.

Otherwise, if you're able to directly power the 3V3 line with something capable of >100mA (at 3.3V, NOT 5V), then you are bypassing the regulator entirely, and so the input current limits won't apply. Note that it's necessary to power VCC_3V3, not just VCC_MCU_3V3 (what the JLink powers), because otherwise the RNG has no power.

shaneleonard commented 7 years ago

Also @ppannuto is right that every USB-compliant host must be capable of 500mA.

shaneleonard commented 7 years ago

Another possible workaround is to power the board from the battery input, because the BQ24230 doesn't limit the input battery current.

ppannuto commented 7 years ago

I think in practice you can just set the BQ to 500 mA mode and everything will work out.

Very few USB ports actually limit downstream stuff to 100 mA nowadays.

alevy commented 7 years ago

Just setting the FTDI to negotiate 500mA doesn't seem to do the trick (which seems to be because of the BQ24230 input current limit that @shaneleonard pointed out).

It's true that USB-compliant hosts should be able to provide 500mA, but hubs are only required to provide 100mA. What is a hub? Undefined... the plugs on your desktop may be hubs, or they may be directly connected to the host.

Ideally (this is purely from a usability perspective), you should be able to run the board on any of the power sources, including from a non-FTDI USB. If it's possible to easily adapt the hardware to a 100mA power constraint (e.g. a current limiting resistor on the RNG or a capacitor to smooth out peak consumption) that's great. If not, as long as we have a software workaround that's probably good enough to move forward.

alevy commented 7 years ago

Is fixing the BQ something I can test myself?

alevy commented 7 years ago

BTW, @shaneleonard setting the FTDI to negotiate more power is done by programming the user-flash on the FTDI chip from a host machine (using a windows program called FT_PROG). I don't think it's possible from the SAM4L.

kwantam commented 7 years ago

As @alevy says, the right solution is to modify the dc/dc converter's design so that its inrush current is limited to less than 100 mA. I'll have to take a look at our schematic and the datasheet for the part we're using before I can make a specific suggestion, but in general there's no reason that keeping the converter's peak current below 100 mA should be tough. (The specific suggestions, resistor and capacitor, seem unlikely to me; but it depends on the specifics of the controller we're using. The most obvious guess would be a bigger inductor.)

Ideally, in the long term, I'd like to replace the controller we've got now with something cheaper. The TI product line we're using is optimized for "drop it in and don't worry," but we can easily do better in terms of cost and performance with a bit more thought. (I also have something of a bias against those parts because I designed competitors to them back in the day :).)

ppannuto commented 7 years ago

@alevy to test the BQ, you need to set EN2=0, EN1=1 ( http://www.ti.com/lit/ds/symlink/bq24230.pdf ). EN1 is pin 6, which unfortunately will be a hard trace to cut:

[image: screen shot 2016-11-21 at 17 46 31]
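As a quick reference for the EN1/EN2 setting discussed here, the following is a minimal sketch of the BQ24230 input-limit selection as a lookup table. The `(0, 0)` and `(0, 1)` rows match what is described in this thread; the other two rows are recalled from the datasheet and are an assumption to verify against the linked PDF. The names `BQ24230_INPUT_LIMIT` and `input_limit` are made up for illustration.

```python
# Input current limit mode on the BQ24230, keyed by (EN2, EN1) pin levels.
# Rows (0, 0) and (0, 1) follow this thread; rows (1, 0) and (1, 1) are an
# assumption recalled from the datasheet and should be checked there.
BQ24230_INPUT_LIMIT = {
    (0, 0): "100 mA (USB100)",
    (0, 1): "500 mA (USB500)",
    (1, 0): "set by external ILIM resistor",
    (1, 1): "standby (USB suspend)",
}


def input_limit(en2: int, en1: int) -> str:
    """Return the input current limit mode for the given EN2/EN1 levels."""
    return BQ24230_INPUT_LIMIT[(en2, en1)]


print("current board:", input_limit(en2=0, en1=0))
print("proposed test:", input_limit(en2=0, en1=1))
```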

@shaneleonard one nice habit for the future is to eliminate traces between pads, both for this reason and because they tend to encourage solder bridges, which, while technically not wrong, make debugging board fab hard b/c you have to recognize that that bridge is okay.

@shaneleonard also, can you include the .brd and .sch files in the fab directories in the future? It's nice to have a snapshot of those so you can open a digital copy of exactly that rev of a board.

alevy commented 7 years ago

Oh, btw, when I feed enough power to the board I can successfully read random looking bits of the RNG. I'm not rigorously measuring time or anything, so the bits I'm getting are probably not exactly random, but definitely close enough to be sure the hardware does what it's supposed to do once we turn it on.

shaneleonard commented 7 years ago

@ppannuto Thanks for the tip, I had corrected the 'traces between pads' mistake on the SAM4L but had missed it on the BQ24230. Would it be valuable to have a DNP pullup/pulldown on EN1 and EN2, or is it better to just tie EN1 high?

shaneleonard commented 7 years ago

@ppannuto I'll make sure to do .brd and .sch snapshots moving forward

shaneleonard commented 7 years ago

@alevy Does feeding enough power like you have with the RNG solve the RF233 brown-out as well?

alevy commented 7 years ago

@shaneleonard it does not. still browns out unless I tie it to ground (actually even then it hard faults when I cut then restore power to the whole board usually, which maybe implies the jolt of current to the rf233 at boot is a bit too high or something)

ppannuto commented 7 years ago

Probably fine to just tie it. If you're really worried put it on a solder bridge or s/t

shaneleonard commented 7 years ago

@alevy From the RF233 datasheet, it seems that 11.8 mA is the highest power mode. There are two 1uF capacitors on VCC_RF233_3V3 (C5 and C13). To me it seems more likely that the current spike is from charging those capacitors than from powering up the chip itself. I could try removing C5 and C13 on one of my boards and testing it.

@kwantam @ppannuto does that sound like a reasonable theory?

My only doubt about all of this was that when I tested the RF233 issue before, I didn't see any sudden changes on the power line.

alevy commented 7 years ago

It could be a fluke. I tested this with a patched old board, which... Totally possible we (I) messed up the soldering. I'll try it with an unpatched one tomorrow

brghena commented 7 years ago

@shaneleonard I think it's pretty unlikely that the decoupling capacitors are doing anything. The time constant on 1 uF capacitors is pretty darn short. I'd believe that the RF233 briefly has a large current draw at startup though.
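To put a rough number on "pretty darn short", here is a minimal sketch that computes the RC time constant for the two 1 uF capacitors (C5 and C13) in parallel. The series resistance is not stated at this point in the thread, so the values swept below are assumptions: 0.07 ohm is the upper Rds(on) figure quoted later in the thread, and 1 ohm and 10 ohm are hypothetical stand-ins for source and trace impedance.

```python
def rc_time_constant(resistance_ohm: float, capacitance_f: float) -> float:
    """Time constant tau = R * C of a first-order RC circuit, in seconds."""
    return resistance_ohm * capacitance_f


# C5 and C13 in parallel on VCC_RF233_3V3 (1 uF each, per the thread).
C_TOTAL_F = 2e-6

# Assumed series resistances; only 0.07 ohm comes from the thread.
for r_ohm in (0.07, 1.0, 10.0):
    tau_s = rc_time_constant(r_ohm, C_TOTAL_F)
    print(f"R = {r_ohm:5.2f} ohm -> tau = {tau_s * 1e6:.2f} us")
```

Note that a short time constant bounds how long the charging spike lasts, not how large it is, so this calculation alone does not settle the question either way.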

@alevy when you are testing this, how are you powering the board? Have you tried powering from a power supply with a really high current limit (amps) and seeing if there is still a brownout?

alevy commented 7 years ago

@brghena no I'm just powering from USB. I should power either from a desktop power supply or a battery (which @shaneleonard says is unrestricted on the board), but didn't get around to locating one today. Will tomorrow.

alevy commented 7 years ago

(@shaneleonard the current ratings in the datasheet are "typical", the "maximum" column is left blank...)

alevy commented 7 years ago

OK, I tested and got some not amazing but useful results. The high bit is that it seems to work when the supply voltage is between 1.9-2.2V. Anything lower wasn't enough to power the MCU, anything higher caused what looks like a brown-out.

Setup

The program I'm running is the rf233 app, which sends a packet every 10 seconds over rf233, modified to also toggle the LED on each packet reception so it's easy to see what's happening.

I'm using the previous revision board (the one with the messed up FTDI layout, so the FTDI chip and USB port are removed). Shane and I patched it to hook up the decoupling capacitors for the SAM4L's VDDIO to VCC. PWR_RF233 is not tied to ground.

I plugged the negative and positive leads of a power supply into GND and 3.3v (on the arduino headers), respectively. I set the power supply to a fairly high current limit (~2 A). I ran the power supply at different voltages. Each time I changed the voltage, I turned off and turned back on power from the power supply.

Expected results

There are two possible results:

  1. The LED blinks regularly every 10 seconds---this means the SAM4L got through initialization and is successfully running the RF233 app. It does not necessarily mean the packets were being sent since it's really only indicating successful completion of the SPI transactions to the RF233.

  2. The LED blinks irregularly and pretty fast. This is the hard fault handler. The stack trace from the hard fault handler has consistently shown a faulting PC that corresponds to an address between two instructions in the text segment, implying some sort of glitch (so... a brown out).

Results

With the voltage set in the range [1.90, 2.20], the LED blinks regularly every 10 seconds. This was the case consistently across multiple power cycles.

With the voltage set in the range [2.21, 3.30] the LED blinks irregularly and rapidly (hard fault). This, again, was the case across multiple power cycles. I didn't go above 3.3V.

Notes

shaneleonard commented 7 years ago

@alevy The only thing between VCC_3V3 and VCC_RF233_3V3 is the transistor controlling the power. From my understanding, the transistor has a very small voltage drop which should effectively be negligible (According to the datasheet, Rds(on) is about 40-70 milliohms for this operating range, so the voltage drop will be on the order of a couple of millivolts)
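The "couple of millivolts" estimate can be checked with Ohm's law. This is a minimal sketch, assuming the load currents mentioned elsewhere in the thread (11.8 mA for the RF233's highest power mode, 30 mA average for the RNG) and the 40-70 milliohm Rds(on) range quoted above; the function name is hypothetical. It covers only the steady-state drop and says nothing about the turn-on transient.

```python
def switch_voltage_drop(current_a: float, rds_on_ohm: float) -> float:
    """Steady-state voltage drop across a switch that is fully on: V = I * R."""
    return current_a * rds_on_ohm


# Load currents taken from figures quoted in the thread (assumed steady state).
LOAD_CURRENTS_A = {"RF233 (11.8 mA)": 11.8e-3, "RNG average (30 mA)": 30e-3}
RDS_ON_RANGE_OHM = (0.040, 0.070)

for name, current_a in LOAD_CURRENTS_A.items():
    low_v, high_v = (switch_voltage_drop(current_a, r) for r in RDS_ON_RANGE_OHM)
    print(f"{name}: {low_v * 1e3:.2f} mV to {high_v * 1e3:.2f} mV")
```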

kwantam commented 7 years ago

It might be useful to watch (on a scope) what Vcc and the switch device's Vgs are doing in these modes.

@shaneleonard did we ever add a R-C circuit on the power switch gates to reduce the inrush current spike?

shaneleonard commented 7 years ago

@kwantam Unfortunately not :/ I thought I had a reason not to, like I remember testing the power supply with the brownout and not seeing any spikes in the power supply voltage, so I think I had either thought that the issue was solved with some other fix, or had run up against the SenSys deadline and the temporary fix was 'good enough'. In any case, we wanted an RC lowpass, right? There are gate resistors, so a cap could be soldered from the resistor to ground for a first-order filter

alevy commented 7 years ago

From my understanding, the transistor has a very small voltage drop which should effectively be negligible

In that case, I think it's a reasonable assumption that the RF233 was on and sending packets at 1.9-2.2V range.

Unfortunately I won't have a scope starting tomorrow until Monday when I get back from Seattle. However, I'd guess that using one of the newer boards @ppannuto & @shaneleonard patched should work at least for seeing an inrush current to the RF233.

If it's up to me, I'll need some help figuring out exactly the setup that @kwantam is proposing (I think it's probes on RF233_PWR_EN, VCC_RF233_3V3 and VCC_3V3).

shaneleonard commented 7 years ago

@alevy I don't have a scope with me until next Monday, but I do have a couple of boards that I could run simple tests with. You are correct about the setup--you'll specifically want to watch VCC_RF233_3V3 and VCC_3V3 right at the moment the PWR_EN is pulled low

shaneleonard commented 7 years ago

@kwantam Ah, shoot, I was wrong--there aren't gate resistors... Just pull-ups/pull-downs

kwantam commented 7 years ago

Putting a time constant on the gates of the power switches is probably a worthwhile addition to the list of TODOs for this rev, if there's still time to add it. Should only be a couple components.

In fact if we haven't yet committed to the contents of the next rev, maybe I should take a look at the dc/dc converter first. I anticipate only small tweaks, but better to have degrees of freedom if possible.

What's the schedule looking like for the board spin? I'll be around the office starting next Tuesday, but I'm available before then to work on this.

kwantam commented 7 years ago

@alevy the signals you've listed are the ones I was thinking of, but it might go more quickly if I work with @shaneleonard on this locally; I can do this sometime next week.

alevy commented 7 years ago

@kwantam even better! I'll try to get the code on a relatively clean branch somewhere so you can use it if you want.

Meanwhile, for timeline, we can spare a couple weeks. The plan is to try a crowd campaign so we can be a bit flexible on time. I think it's much better to be a bit safe than risk spinning another rev with serious problems if we can avoid it.

alevy commented 7 years ago

@shaneleonard & @kwantam how is debugging this going?

kwantam commented 7 years ago

Sorry, I dropped the ball on this amid other end-of-term activities.

I'm around Stanford over the break and I've got two other high priority deadlines before end of year, but I can spend a couple hours on this if @shaneleonard is around. Otherwise, first thing next year.

Sorry this didn't get done sooner.

shaneleonard commented 7 years ago

I'm back in Colorado, but I brought some equipment home with me so I could continue debugging. I'm finishing up testing the antenna issue first and then will be capturing some waveforms on the oscilloscope for the power issue to get some more concrete data.

shaneleonard commented 7 years ago

This plot demonstrates the power glitch. When the RF233 turns on (enable line brought low), the MCU power temporarily glitches and causes a brownout, confirming our theory.

[image: powerglitchplot]

Adding a passive RC filter to the RF233 enable line mostly fixes the issue:

[image: passivefilterpatch]

I say 'mostly' because the RF233 still doesn't fully turn off:

[image: passivefilterrf233doesntfullydischarge]

[image: rf233powerdischargecloseup] (close up of previous plot)

shaneleonard commented 7 years ago

I believe the issue is still that the enable line isn't all the way at the rail. I am going to try some different configurations with pullup resistors to see if I can get a working solution.

ppannuto commented 7 years ago

I think it might be easier to move from the current bare transistor design to a dedicated power IC. We use http://www.vishay.com/docs/63705/sip32401a.pdf on several of the signpost boards without issue. Relevant to the immediate issue:

They feature a controlled soft-on slew rate of typical 2.5 ms that limits the inrush current for designs of heavy capacitive load and minimizes the resulting voltage droop at the power rails.

I feel like you're slowly re-inventing the circuit inside that chip (or something similar)
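To illustrate why a controlled turn-on limits inrush, here is a minimal sketch using I = C * dV/dt for a capacitive load charged by a linear voltage ramp. It assumes the load is just the two 1 uF capacitors on VCC_RF233_3V3 mentioned earlier in the thread, a 3.3 V rail, and that the quoted "2.5 ms" can be treated as the ramp time; the 10 us "fast" ramp is a hypothetical comparison point, not a measured value.

```python
def ramp_inrush_current(capacitance_f: float, delta_v: float, ramp_time_s: float) -> float:
    """Current needed to charge a capacitor through a linear ramp: I = C * dV/dt."""
    return capacitance_f * delta_v / ramp_time_s


C_LOAD_F = 2e-6   # C5 + C13, 1 uF each (from the thread)
V_RAIL = 3.3      # volts

soft_on_a = ramp_inrush_current(C_LOAD_F, V_RAIL, 2.5e-3)  # datasheet-quoted typical
fast_on_a = ramp_inrush_current(C_LOAD_F, V_RAIL, 10e-6)   # hypothetical fast edge

print(f"2.5 ms ramp: {soft_on_a * 1e3:.2f} mA")
print(f"10 us ramp:  {fast_on_a * 1e3:.0f} mA")
```

Under these assumptions the slow ramp keeps the charging current in the low milliamp range, while a fast edge on the same capacitance would demand hundreds of milliamps for a short time.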

shaneleonard commented 7 years ago

I think you're right, this would be a good way to go. Also, having active-high logic on the enable pins helps with the solution Riad and I worked out for the isolation issue as well. Since my next revision is a small run, I'll try it with the power IC and make the switch if it works. At the very least, I know now for sure that the issue was indeed a brownout, and the solution was limiting the slew-rate.

kwantam commented 7 years ago

The concern with using a part like the SIP32401 is that it's overkill and ends up increasing the BOM cost (the transistor plus resistors should be a couple pennies, while the above cited part is more like 15 cents at thousand quantity). This is not a particularly hard problem to solve with discretes. But if board space or simplicity is more important than cost, by all means go for it.

kwantam commented 7 years ago

Shane, do you have a schematic of the current setup? It's not clear to me why there would be any trouble getting the gate all the way to Vdd.

ppannuto commented 7 years ago

$0.02 vs $0.15 for components on a $100 Board is in the noise. We've sunk a ton of time into diagnosing power issues on this board already. A known good solution is way more valuable IMO at this point.

kwantam commented 7 years ago

Like I said, other things might take priority and spending the extra money could be the right decision in this case. Still, you must be aware that this maxim does not universalize (unless you're excited for a $100 board to become a $500 board).

In any case, there's no logical connection from "we've spent a lot of time diagnosing" to "therefore use a new part." There would be a connection if the time had been spent debugging a fix, but that's not the case---or if it is, I'm not sure why, since I pointed out that this was the issue and suggested a correct fix months ago.

ppannuto commented 7 years ago

Sure, going from $1 -> $15 is not the same. My argument is that going from $0.01 -> $0.15 for something in quantity ~5 should not even be brought up as a discussion for this board.

The connection is that Shane has tried a fix, and that fix did not work. Rather than continuing to try more fixes, let's take the part that is known to work correctly for this exact application and move on.
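For scale, the per-board cost difference under discussion can be worked out directly. This is a minimal sketch using the figures quoted in the thread ($0.02 vs $0.15 per switch, about 5 switches per board, a $100 board); prices are kept in integer cents to avoid floating-point rounding, and the function name is hypothetical.

```python
def bom_delta_cents(quantity: int, discrete_cents: int, ic_cents: int) -> int:
    """Extra BOM cost per board, in cents, of using the IC instead of discretes."""
    return quantity * (ic_cents - discrete_cents)


BOARD_PRICE_CENTS = 100 * 100  # the $100 board figure from the thread

delta = bom_delta_cents(quantity=5, discrete_cents=2, ic_cents=15)
share = delta / BOARD_PRICE_CENTS

print(f"extra cost per board: ${delta / 100:.2f} ({share:.2%} of the board price)")
```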

shaneleonard commented 7 years ago

I take the blame for the speed at which this fix has been implemented, as most of my time personally has been spent waffling on this very decision on whether to focus on a discrete fix or use an IC, and not very efficiently switching focus between the serial stuff/other power issues. All said and done, my intention in running these recent tests was to see for certain that our hypothesis was correct without spinning another board. I also wanted to fully characterize the discrete solution in order to evaluate it compared to another approach.

Therefore, I think the best course of action right now is to try the SIP32401 for the next small run, as it reduces the routing complexity in multiple ways and is likely to be a final solution. I can easily revert back to debugging the discrete approach if that's a total failure, so no harm done. Going down the discrete path gave me valuable insight into the issue, so I think it's fulfilled a purpose. Honestly much of the reason I've been slow to execute this has been my lack of experience in making decisions like this--I'm truly thankful for everyone's input, but it's also been a learning curve for me figuring out how to weigh all the conflicting opinions. In this case, I think BOM cost should be tackled as a separate concern, one which requires a much more holistic look at the board.

shaneleonard commented 7 years ago

I will say that for me, routing complexity takes fairly high precedence here, because on a breakout board like this where basically everything is broken out, there isn't much flexibility.

alevy commented 7 years ago

@kwantam and @ppannuto it sounds like we all agree that there is a case for saving on the BOM and that it may apply here, or it may not. It certainly seems like for this next revision, where we're spinning only a relatively small number of boards, we should do whatever is most expedient for Shane to get right.

Especially on the question of a dedicated IC versus a handful of discrete parts that would do the same thing, we can always go back and change our minds if the unit economics call for it, since the board functionality ought to be basically the same.

I don't know what the right answer is, but it seems like @shaneleonard should just do the one he is most confident in for now.