partyrobotics / bartendro

GNU General Public License v2.0

Intermittent Proper Dispenser Detection #157

Closed pmich closed 9 years ago

pmich commented 9 years ago

Someone has mentioned that their bot does not detect all their dispensers 100% of the time. Out of 15, sometimes it detects 14, or 13, or 12. They have to restart many, many times for the bot to detect all 15, but in the end it usually works. After much experimentation, I've actually been able to recreate the bug on a brand new router board. It is still intermittent in nature, but it happens more often than not.

My setup is a B15 with 2 v3 dispensers with IDs B3 and AB. The first symptom is that on power up the dispensers flash blue 3 or 4 times, then turn red. However, the dispensers always get into the color fade mode once the bot is booted, which is confusing because it makes it seem like everything is OK. Once I go into the debug menu I see that the IDs are conflicting and that dispenser v2 was detected. (image)

The dispensers work fine on another board, and if plugged in individually, their ID value is more often read correctly in the debug screen. If both dispensers are plugged in, it seems that about 30% of the time both dispenser IDs will be read correctly; otherwise they will display a conflict with ID 63 and v2 detection.

I decided to capture with a logic analyzer to see what was going on. I've reduced the setup down to a single dispenser, and this is what it looks like on startup (orange is TX from the router, red is TX from the dispenser):

After a faulty start, this is what it looks like when I hit 'reset dispensers' from the dispenser page: (image)

This is what it looks like when I hit 'test dispensers' in the debug page: (image)

On boot (zoomed in on the very first thing): (image)

On boot (zoomed all the way out): (image)

After a power cycle, I got a different result. On boot it actually got one of the IDs correct: (image)

This is what 'test dispenser' from the debug page produces: (image)

I know we had a software bug at one point that was timing related, but I don't know if this is similar, I'd love to get your thoughts.

Other than chucking this in the dead router board pile, is there some test to further verify the state of the router board and know that it really is bad, and more importantly, how do we test for a problem like this before we ship hardware out? Should / could we make the dispensers display a different pattern if their ID is never read correctly by the router?

mayhem commented 9 years ago

Can this problem ever be duplicated with an older router board and new dispensers?

Somehow, I don't trust the new router boards.

partyrobotics commented 9 years ago

I'm still slacking on finding and recovering an old router board. I have managed to reproduce the failures on multiple new router boards though, which is pretty scary.

On Fri, Sep 19, 2014 at 3:42 AM, Robert Kaye notifications@github.com wrote:

Can this problem ever be duplicated with an older router board and new dispensers?

Somehow, I don't trust the new router boards.


mayhem commented 9 years ago

This is exactly why I am suggesting the board-by-board comparison. I don't trust the new router boards.

pmich commented 9 years ago

This issue has been resolved. The dispensers were found to be entering into text-mode inappropriately. A dispenser firmware update requiring 3 consecutive '!' to get into text mode did the trick.
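The fix described above can be pictured as a small counter on the dispenser's receive path: text mode is entered only after three '!' bytes arrive back to back, presumably so that an isolated '!' (for example from line noise at startup) is no longer enough. The following is a minimal sketch under that assumption, not the actual dispenser firmware; `text_mode_feed`, `buffer_triggers_text_mode` and the constants are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; not taken from the Bartendro dispenser firmware. */
#define TEXT_MODE_CHAR   '!'
#define TEXT_MODE_COUNT  3

typedef struct {
    uint8_t run;   /* number of consecutive '!' bytes seen so far */
} text_mode_detector_t;

/* Feed one received byte. Returns true once TEXT_MODE_COUNT
 * consecutive '!' bytes have been seen. */
static bool text_mode_feed(text_mode_detector_t *d, uint8_t byte)
{
    assert(d != NULL);
    if (byte == TEXT_MODE_CHAR) {
        if (d->run < TEXT_MODE_COUNT)
            d->run++;
    } else {
        d->run = 0;   /* any other byte breaks the run */
    }
    return d->run >= TEXT_MODE_COUNT;
}

/* Helper for experiments: does this byte sequence ever trigger text mode? */
static bool buffer_triggers_text_mode(const uint8_t *buf, size_t len)
{
    text_mode_detector_t d = { 0 };
    for (size_t i = 0; i < len; i++) {
        if (text_mode_feed(&d, buf[i]))
            return true;
    }
    return false;
}
```

With this shape, a sequence such as `'!', 0x00, '!', '!'` does not switch modes, because the run is broken by the zero byte, while `'!', '!', '!'` does.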

robotskirts commented 8 years ago

My machine is exhibiting this behavior. How do I update the dispensers?

partyrobotics commented 8 years ago

Hi Eliot,

Please follow the instructions here: http://support.partyrobotics.com/Guide/Re-programming+a+Dispenser/13

Best, -Pierre

On Fri, Oct 9, 2015 at 1:22 PM, Eliot notifications@github.com wrote:

My machine is exhibiting this behavior. How do I update the dispensers?


nighteagle1974 commented 7 years ago

Hello,

The latest dispenser firmware linked in the post above is older than pmich's post, which says the trick is to require 3 consecutive "!" to get into text mode. I have compared the sources from 2015 and 2016, and they are identical except for the LED swapping and the software revision ID.

I see the same behaviour with the newer firmware from October 2016. I compiled it on a Raspberry Pi 2 with avr-gcc, and programming the dispensers with avrdude works fine. Setting the ID also works fine. With the solder jumper set to version 4, I boot Bartendro and the pumps are recognized as version 4... but only on the first slot. For testing I use only two pumps, on slot 1 and slot 2, both revision 4 with the firmware from Git. Often the pumps are not recognized in the Bartendro debug window, and when I press the "dispense 10ml" button in the dispenser menu, the pump sometimes does not start and error 502 comes back. In the debug window I see "Dispenser V2" detected for the first pump. If I swap the pumps, the first pump is again detected as V2 the next time. So is there a bug in the detection of the first pump, i.e. slot 1?

To prevent the switch into text mode (we only need binary mode to communicate with the router?), can I change something in the dispenser sources? Deactivate text mode?

Regards

nighteagle1974 commented 7 years ago

Hello again,

I have had a look at the router sources.

  1. I see the baud rate is not calculated correctly:

    #define UBBR (F_CPU / 16 / BAUD - 1)

    This gives 52.08. See the Atmel datasheet: ((F_CPU + UART_BAUD_RATE * 8L) / (UART_BAUD_RATE * 16L) - 1), which gives 51.58.

  2. router.c: what is g_reset used for? I see it being set and cleared, but I don't see it used in any logic. If g_reset is not used, then we don't need this:

        cli();
        reset = g_reset;
        sei();

        if (reset)
        {
            cli();
            g_reset = 0;
            sei();
            break;
        }

Why do we hold off the ISR just for a pointless write to g_reset?

I see that echo_dispenser is called from an ISR. Interrupts are disabled at the start of the ISR and re-enabled by reti at the end. If you call echo_dispenser in the ISR:

    ISR(PCINT0_vect)
    {
        echo_rpi();
        echo_dispenser();
    }

then nothing can interrupt this function in the way the comment here describes:

    void echo_dispenser(void)
    {
        volatile unsigned char *group;
        uint8_t pin, state;

        // capture a local copy of g_dispenser to guarantee the next two lines use the same value
        // (interrupt could come between the two and change g_dispenser, I think? Depends on interrupt
        // rules. I don't *think* this is an issue, but it would be a rare enough failure case that
        // it would be near impossible to debug.)
        uint8_t disp = g_dispenser;

        group = dispenser[disp].group;
        pin = dispenser[disp].pin;

        state = *group & (1 << pin);

        if (state)
            sbi(PORTB, 0);   // bit 0 goes high
        else
            cbi(PORTB, 0);   // bit 0 goes low
    }

I think we don't need to copy a global variable to a local variable in a function called from an ISR.

What are your thoughts on these points?

nighteagle1974 commented 7 years ago

It looks like the code formatting is not working correctly here on GitHub?

mayhem commented 7 years ago

g_reset is used to reset each pump on the bot -- it is a critical function. A message comes in via I2C to reset the bot, which gets acted on here:

https://github.com/partyrobotics/bartendro/blob/master/firmware/router/router.c#L240

Why do you think the ISR is broken by writing to a variable? The variable is defined as volatile, so I see no problems with it. Have you tried any of the changes you are suggesting? Do they improve things?

nighteagle1974 commented 7 years ago

No, I don't think the ISR is broken. The programmer wrote here:

    // capture a local copy of g_dispenser to guarantee the next two lines use the same value
    // (interrupt could come between the two and change g_dispenser, I think? Depends on interrupt
    // rules. I don't think this is an issue, but it would be a rare enough failure case that
    // it would be near impossible to debug.)

But that is not correct: interrupts stay disabled while the ISR runs and are only re-enabled by reti, so you don't need to copy the global to a local in a function called from an ISR.

On the layout I see that each pump has a hardware reset input coming from the router, connected to the router's hardware reset pin. But in the router source, g_reset is only written, never read:

    data = TWDR;
    if (data == ROUTER_CMD_RESET)
        g_reset = 1;

So I don't understand what you do with the global variable g_reset.

I'd like to get your thoughts on these points first, and then I will compile the sources. The baud rate problem is one of them. The baud rate calculators on the web also give the correct value of 51 for 8 MHz and 9600 baud. Your calculation gives a value of 52 and about 1% more error, so we end up at 1.2%. With the 3.3V Pegel and long cables (0.5 m) without a line driver, we get problems with communication over the UART.
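One way to check the two formulas is to evaluate them on a PC with the same integer arithmetic the C preprocessor would use. The sketch below assumes that F_CPU and BAUD are integer constants (8 MHz and 9600, the values mentioned above) and that the UART runs in normal-speed asynchronous mode, where the actual rate is F_CPU / (16 * (UBRR + 1)); the function names are hypothetical. Under those assumptions both forms truncate to 51 for 8 MHz / 9600, with an error of roughly 0.16%, and they only diverge at other settings such as 8 MHz / 57600 (7 versus 8).

```c
#include <assert.h>
#include <stdint.h>

/* Truncating form, as quoted from the router source in this thread. */
static uint32_t ubrr_truncated(uint32_t f_cpu, uint32_t baud)
{
    assert(baud > 0);
    return f_cpu / 16u / baud - 1u;
}

/* Rounding form: adding half the divisor (baud * 8) before dividing
 * rounds to the nearest integer instead of truncating. */
static uint32_t ubrr_rounded(uint32_t f_cpu, uint32_t baud)
{
    assert(baud > 0);
    return (f_cpu + baud * 8u) / (baud * 16u) - 1u;
}

/* Actual baud rate produced by a given UBRR value
 * (assumption: normal-speed asynchronous mode). */
static double actual_baud(uint32_t f_cpu, uint32_t ubrr)
{
    return (double)f_cpu / (16.0 * ((double)ubrr + 1.0));
}

/* Relative deviation from the requested baud rate, in percent. */
static double baud_error_percent(uint32_t f_cpu, uint32_t baud, uint32_t ubrr)
{
    return (actual_baud(f_cpu, ubrr) / (double)baud - 1.0) * 100.0;
}
```

Calling `baud_error_percent(8000000, 9600, 52)` shows what a register value of 52 would cost: about -1.7%. Whether the firmware ever ends up with 52 depends on how F_CPU and BAUD are actually defined, which is worth checking in the build.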

nighteagle1974 commented 7 years ago

OK, now I see more.

You have a for loop that tests the reset variable. If it is set, you break out of the loop and go up one level, where the dispensers are reset. OK, I understand.
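The pattern described here can be sketched on a PC as follows. This is an illustration only, not the actual router.c: `cli()` and `sei()` are empty stand-ins for the AVR macros, the interrupt is simulated by a plain function call, and `inner_loop`, `outer_step`, `on_reset_command` and `reset_count` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins so the sketch builds on a PC; on AVR these come from
 * <avr/interrupt.h> and really disable/enable interrupts. */
static void cli(void) {}
static void sei(void) {}

static volatile uint8_t g_reset = 0;  /* set by the (simulated) I2C interrupt */
static unsigned reset_count = 0;      /* hypothetical: counts dispenser resets */

static void reset_dispensers(void)
{
    assert(g_reset == 0);  /* the flag is cleared before we get here */
    reset_count++;
}

/* Simulated interrupt handler: a reset command arrived. */
static void on_reset_command(void) { g_reset = 1; }

/* Inner loop: polls the flag each pass and leaves early when it is set.
 * reset_at is the pass on which the simulated interrupt fires. */
static bool inner_loop(unsigned max_passes, unsigned reset_at)
{
    for (unsigned i = 0; i < max_passes; i++) {
        uint8_t reset;

        if (i == reset_at)
            on_reset_command();

        cli();
        reset = g_reset;   /* read the flag without interference */
        sei();

        if (reset) {
            cli();
            g_reset = 0;   /* acknowledge the request */
            sei();
            return true;   /* corresponds to the break in the excerpt */
        }
    }
    return false;
}

/* One level up: a reset request leads to the dispensers being reset. */
static void outer_step(unsigned max_passes, unsigned reset_at)
{
    if (inner_loop(max_passes, reset_at))
        reset_dispensers();
}
```

So the flag is read after all: the interrupt only records the request, and the main loop acts on it at a point where resetting the dispensers is safe.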

mayhem commented 7 years ago

"3.3V Pegel" -- are you German? :)

long Cables 0,5m -- yes, I've heard of this problem before of bots that do not use Bartendro cases (ergo longer cables are needed). I would love to hear if this makes any difference in communication reliability. Also, do you have problems while the motors are running (interference) or also when motors are not running?

nighteagle1974 commented 7 years ago

Yes, German.. I meant 3.3V level :-)

mayhem commented 7 years ago

Sure, got it. :)

nighteagle1974 commented 7 years ago

Yes, I have problems like the ones Pierre described above. I'm looking into it now. First, the 3.3V level is a problem with cables longer than 0.5 m. Sometimes the pumps are not recognized correctly, and I have to restart the bot many times. Next I will check the communication with a logic analyzer.

nighteagle1974 commented 7 years ago

Great that you're all German. Why can't one tell? :-)

mayhem commented 7 years ago

I'm German, but I'm the only one. :) And I've also lived in California for too long...

I would love to hear if you can find problems on the logic analyzer -- I've never had a Bartendro with longer cables, so I've never really seen this problem.

nighteagle1974 commented 7 years ago

Cool... over at Arnie's?

Yes, 3.3V is a problem for UART over cables. With some cables it may work, but in other situations it won't work correctly. I think this is why some people have problems and others don't.

mayhem commented 7 years ago

Yes, at Arnie's. :)

But, I do wonder if the UART calculation will make any difference -- I'd love to hear about it.

mightybigcar commented 7 years ago

OK, I'm just an embedded software guy who can't even reliably read a schematic, but the I2C discussion reminds me of something I've bumped into in previous projects with long I2C lines (that is, over about 100mm). In my experience (which may or may not apply here), for long distance, I2C gets finicky about trace/cable resistance, capacitance, noise, and so on. You may need to add an active pull-up on the line, or tweak the UART configuration (if possible), or switch to shielded or twisted pair cables, or some combination of these. Anyway, I think this tends to confirm nighteagle1974's concerns about 3.3V on the longer cables - the next thing I'd do there is bring in an EE to check that and maybe add a pull-up (or try to configure a stronger pull-up on the UART).

nighteagle1974 commented 7 years ago

Hi Robert,

I've found some time for the Bartendro again. I've changed a few things.. now it seems to work well... at least there are no more disturbances in the communication.

A quick question about the software UI: when you create cocktails yourself, what does "Parts" mean for the individual ingredients? Are those cl, or percentages of the total amount, which you can configure (150ml drink size in my case)?

Although that doesn't fit either.. then the result would always have to come out to 10 parts.. But there are also cocktails that have 12 parts in total... Somehow I don't get it.

Regards,

Borris

On 28.04.2017 13:42, Robert Kaye wrote:

Yes, at Arnie's. :)

But, I do wonder if the UART calculation will make any difference -- I'd love to hear about it.


nighteagle1974 commented 6 years ago

Hi,

can you perhaps help me with the question of which files I need to look in to find the dispensed amounts of the ingredients? The dispensed amounts must also be calculated somewhere for the trending!?

I would like to store the dispensed amounts and subtract them from a previously defined value.

E.g. all ingredients are in 1L bottles, so I set a variable for each ingredient, i.e. bootleSizeIDn = 1000ml; (n=1-15). Then, when a drink is made, the dispensed amounts are known, so I subtract each of them from the corresponding bootleSizeIDn.

For a shot it is then simply 1000ml - 30ml = 970ml, so bootleSizeIDn then holds only 970ml. With that you could set a limit: when bottleSize drops below 100ml, a message appears saying that the bottle needs to be changed.
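The bookkeeping described above is small enough to sketch. The Bartendro UI itself is written in Python, so this C version only illustrates the arithmetic and is not a patch for it; the type and function names are hypothetical, and the 15 dispensers, 1000 ml bottles and 100 ml threshold are the numbers from this comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_DISPENSERS    15     /* n = 1..15, as in the comment above */
#define BOTTLE_SIZE_ML    1000   /* every ingredient starts as a 1 L bottle */
#define LOW_THRESHOLD_ML  100    /* warn when less than this remains */

/* Hypothetical tracker: remaining volume per dispenser, in ml. */
typedef struct {
    int32_t remaining_ml[NUM_DISPENSERS];
} bottle_tracker_t;

static void tracker_init(bottle_tracker_t *t)
{
    assert(t != NULL);
    for (int i = 0; i < NUM_DISPENSERS; i++)
        t->remaining_ml[i] = BOTTLE_SIZE_ML;
}

/* Subtract a dispensed amount from one bottle (dispenser is 1-based).
 * Returns true if the bottle is now below the warning threshold. */
static bool tracker_dispense(bottle_tracker_t *t, int dispenser, int32_t ml)
{
    assert(t != NULL);
    assert(dispenser >= 1 && dispenser <= NUM_DISPENSERS);
    assert(ml >= 0);

    int32_t *r = &t->remaining_ml[dispenser - 1];
    *r = (*r > ml) ? (*r - ml) : 0;   /* never go below empty */
    return *r < LOW_THRESHOLD_ML;
}
```

A 30 ml shot from a fresh bottle leaves 970 ml and no warning; the warning would fire once the same bottle drops under 100 ml. In the real UI the remaining amounts would also have to be persisted and reset when a bottle is swapped.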

That seems simpler and more reliable to me than the liquid sensor.

In the forum / answers section nothing happens for weeks :-(

Regards,

Borris

mayhem commented 6 years ago

That seems simpler and more reliable to me than the liquid sensor.

That's true. We only realized that far too late.

Unfortunately this feature is a bit more work: the database has to be extended and new user interfaces have to be built…

But here is a good starting point:

https://github.com/partyrobotics/bartendro/blob/master/ui/bartendro/mixer.py#L91