ElTangas / jtag2updi

UPDI programmer software for Arduino (targets Tiny AVR-0/1/2, Mega AVR-0 and AVR-DA/DB MCUs)
MIT License
318 stars 91 forks source link

JICE_io::flush() is brittle #29

Closed SpenceKonde closed 4 years ago

SpenceKonde commented 4 years ago

It only works if you know there is a transfer in progress when you call it.

If there isn't, it will clear the bit, and then wait forever for a transaction that isn't happening to set it again...

(this is one of the things I'm fixing in the version I plan to submit hopefully today - needed to get it working in the case where you don't know if there's something being transferred in order to get a debugging channel opened from a '3216 - I flush, swap serial to the alt pins, output debug info, flush again, and swap the pins back)

ElTangas commented 4 years ago

I know. But that function is only ever called when a transmission is in progress, can you elaborate on the specific situation in which it's causing infinite loops?

edit: oh ok you want to make some changes to the code that requires a more robust function.

SpenceKonde commented 4 years ago

Well, I needed a way to get debug output from it - but I didn't have any upload tool to drive it other than AVRdude, and I couldn't inject arbitrary debug information into that datastream without avrdude objecting. I figured, well just grab something with two serial ports. But I couldn't find any of my 328pb boards, the 1284p board I had kicking around was shot, and I couldn't seem to make it work on my mega2560 either (and was getting really sick of trying), so I grabbed one of the 3216 boards I have (I have literally a bin full of them, since I sell them on Tindie) - at least that one worked as well as my 328p based board. And it doesn't have two serial ports, but you can move the serial port around real easy.. I just had to make sure it wasn't in the middle of sending something when I ripped the serial port out from under it to send debug output. Hence flush. At which point I noticed that "wait, if I call that without knowing something's being sent, it'll spin forever).

My solution was:

    loop_until_bit_is_set(HOST_USART.STATUS, USART_DREIF_bp);  
    HOST_USART.STATUS=1<<USART_TXCIF_bp;  
  return HOST_USART.TXDATAL = c;  

and don't have flush clear that flag, and for the debug output I did

  void startDebug() {  
    JICE_io::flush(); //flush anything we're currently writing to host...  
    PORTMUX.CTRLB |= 0x01; //switch to the alternate pins...  
  }  
  uint8_t putDebug(char c) {    
    loop_until_bit_is_set(HOST_USART.STATUS, USART_DREIF_bp);  
    HOST_USART.STATUS = 1 << USART_TXCIF_bp;
    return HOST_USART.TXDATAL = c;  
  }  
  void endDebug() {  
    JICE_io::flush(); //flush anything we're currently writing to debug...  
    PORTMUX.CTRLB &= ~0x01; //switch to the normal pins...  
  }  

Worked pretty well (FIVE serial adapters involved - one driving jtag2updi on nano to upload new versions to the '3216. one connected to the 3216 to attempt uploading through,, one with it's RX line tied to the alt serial TX pin, and the remaining two with their rx lines tied to the TX and RX between the '3216 and the serial adapter, so I can log both sides of the conversation.) Reads work great now. but the writing still doesn't seem to behave...

I can get one write done, but the NVMCTRL goes wacky after that...

Fresh from POR, I get

NVMCTRL.STATUS, NVMCTRL.CTRLA = 00 04

Write the flash write command and load in the first 256b of data (which post mortem shows gets written correctly), but then...

NVMCTRL.STATUS, NVMCTRL.CTRLA = FF FF

At that point, whether I write a NOOP to CTRLA before or after I wait for status to clear, it never clears, and it then times out.

As an aside, I have got to find some way to deal with the way jtag2updi seems to need a reset to behave properly sometimes (particularly frusrating on the tinyAVRs with reset shared with UPDI - I just upload jtag2updi again if the process barfs midway through, because even powercycling them doesn't seem to fix it reliably - often it starts up in the bad state too)... I stared at it for hours earlier and got nowhere - unless I have a breakthrough, I'm, just going to use the WDT so when it gets into that bad state, it will reset itself. I stared at it for hours... it looks like its stuck in in JICE_io::get(), not seeing any of the characters that I can clearly see going down the serial line into it... (If it's resetting every second if not getting any serial input, who cares? There isn't anywhere near that much idle time in the programming process.... )

ElTangas commented 4 years ago

Yes, I'm aware of most of those issues. Problem is, I admit I'm quite lazy so I never got around to fix them. You are correct, JICE_io::get() is the main culprit because it doesn't have a timeout mechanism, so it will enter an infinite loop waiting for data to arrive (actually waiting for a start bit falling edge).

That's why you then need a reset of the programmer. The problem with the AVR-0/1 tinies is that power cycling them doesn't always cause a reset because they can stay alive on small amounts of energy stored in decoupling capacitors and other caps the board may have. I had that problem too... you need to short GND and VDD to discharge the caps and make sure a power cycle occurs. Maybe setting BOD to a fairly high voltage will make it easier.

Sometimes letting the UPDI line float will eventually create the needed falling edge via random noise and exit the infinite loop.

I also recall reading somewhere that the AVR-0/1 USARTs have some issues when they are turned on/off or shifting between alternate pins but I don't remember the details...

Another thing that may be of interest to you, is that avrdude actually doesn't complain about extra data being received, as long as the checksum required by the jtagice mk 2 protocol checks out. In other words, you can inject extra data in responses if you want to. For an example (that I've not documented yet), see the JTAG2::sign_on() function, if you define READ_SIB to 8 or 16, extra info will be appended to the sign-on response, that you can read if you set avrdude to verbosity lvl 4 (-vvvv). This can be done for any response, as long as the packet.size_word[0] variable is updated with the new size of the message.

SpenceKonde commented 4 years ago

At this point I think I've nearly got a version ready that detects (via timeout) and recovers from both the obvious types of communication errors I could find that seemed to be able to cause a hang... It should be ready in the next few days.... also have most of the work done on a better debug output channel

ElTangas commented 4 years ago

Function is fixed in new version, closing.