edward0429 / arduino

Automatically exported from code.google.com/p/arduino

Burning bootloader with AVR dragon using either IDE or Makefiles bricks AVR. #650

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
When attempting to burn a bootloader using either the IDE Burn Bootloader option
or any of the Makefiles down in hardware/arduino/bootloaders/xxx,
the fuses will be burned and the flash erased, but the bootloader is not burned,
resulting in a non-operational chip.

This behavior is easily reproducible on both Windows and Ubuntu Linux.
It is probably an issue on other OSs and
potentially an issue with other USB ISP programmers.

This is not a permissions issue or a libusb installation/configuration issue.

This is due to an interaction between how the burning tool
(IDE or Makefile) is using avrdude, avrdude itself, libusb and the OS.

The crux of the problem is that two avrdude commands are being used to burn
a bootloader. One command to set the fuses and erase the flash and another
to burn the bootloader to the flash.
The 2nd avrdude command is failing because it can't locate the USB device.

Digging deeper, the problem is that when the avrdude command finishes
using the USB device, it resets the USB device.
This reset causes the USB device to have to go
through enumeration again. This enumeration takes time. If another avrdude
command runs before the enumeration is complete, it will not see the USB
device on the USB bus.

The reset cannot be removed because for some reason if the USB device is
not reset, it will fail to communicate properly. There is probably some
other issue either in the OS or libusb itself that is causing this issue.

In my view this really is an avrdude command issue or at least an
issue that avrdude could resolve.
I went so far as to enter a bug for it on savannah:
https://savannah.nongnu.org/bugs/?34339
So far, it was not well received as the avrdude maintainer
considers this to be an OS issue rather than an avrdude issue.

There are some options to work around this.
1. add a blind delay between the two avrdude commands (in IDE and Makefiles)
2. Update the avrdude code to poll for the devices for a few seconds
   rather than just look one time.
3. Alter the IDE and Makefiles to use a single avrdude command rather than two.

#1)
Currently, it looks like there already is a 1 second blind delay in the IDE
code down in AvrdudeUploader.java in the burnBootloader() function.
The problem is that 1 second is not long enough.
From my experimentation on some of my machines, 2 seconds seems to work
on Ubuntu 10.10. While I didn't fully test this
on Windows to find the exact time needed,
I can say that on my older 1.6 GHz Dell it takes longer
than 2 seconds for the enumeration to complete.
The problem with these blind delays is that it slows down things for everyone
since the full wait is always done and there is no way to really know what
the best value to pick is. 

#2)
It is possible to modify the avrdude code to be smarter and
poll for "a while" looking for USB devices rather than just look one time.
I modified the code in usb_libusb.c to do this (it is only 7 lines of code)
and placed a patch for it in the savannah bug noted above.
This update works well: it polls the USB bus every 1/10 of a second until
it finds the desired USB device, using a maximum delay of 3 seconds.
(Probably should up this to 5 seconds.)
The nice thing is that it will now only delay as long
as needed, and only for the full time if there truly is a problem.
On my Ubuntu 10.10 machine the delay is just over 1 second.
But given the initial reception for updating avrdude to compensate
for the USB device behavior, this type of change probably will not be
accepted, or at least may be well beyond the uno 1.0 release timeframe.
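
To make the polling idea concrete, here is a minimal sketch in Java rather than the C of the actual patch. It is not the patch itself: the class name, the method name, and the probe callback are all hypothetical stand-ins for "scan the bus and look for the programmer". Only the 1/10 second interval and the 3 second limit come from the description above.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper: these names are illustrative and are not taken from
// avrdude or the Arduino IDE.
final class DevicePoller {

    // Calls probe repeatedly, sleeping intervalMs between attempts, until it
    // returns true or timeoutMs has elapsed. Returns true as soon as the
    // device is seen, false on timeout or if the thread is interrupted.
    static boolean waitForDevice(BooleanSupplier probe, long intervalMs, long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (probe.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // Stand-in probe: pretends the device reappears about 1.2 s after the reset.
        BooleanSupplier fakeProbe = () -> System.nanoTime() - start > 1_200_000_000L;
        boolean found = waitForDevice(fakeProbe, 100, 3000);
        System.out.println(found ? "device found" : "timed out");
    }
}
```

Unlike a blind delay, the full timeout is only paid when the device never shows up, so raising the limit to 5 seconds would cost nothing in the normal case.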

#3)
Perhaps the best way to deal with the issue is to avoid it
completely by changing things to burn the fuses and the bootloader
in a single avrdude command.
It is a very small change to modify the Java code and the Makefiles
to use a single avrdude command rather than two.
Not only will doing this avoid the issue, but it will speed up
the bootloader burning process by avoiding the overhead of a blind
delay and the overhead of a second avrdude command.
It may also help in some cases with AutoReset issues on non-USB
devices, as there will no longer be a port close and re-open between
setting the fuses and burning the bootloader, since it would all
now be done with a single call to avrdude.

I modified the optiboot Makefile in westfw's latest optiboot working tree
to do this as an example.
(See attachment). It was a very small change and this change
can and probably should be propagated through all the other Makefiles as well.
It is very simple and impacts no other code.

The Java code can easily do the same.
So down in burnBootloader(), rather than call avrdude() and then
Thread.sleep(1000),
just build up a single List for both the fuses and the bootloader.

I've attached an untested attempt at what that could look like.
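
For readers without the attachment, the single-List idea can be sketched roughly as follows. This is not the attached code and not the IDE's actual burnBootloader(): the class, the method, and its parameters are hypothetical, and the fuse values and file name in main are placeholders, not values for any particular board. Only the avrdude options themselves (-e and -U memtype:w:value:format) are real.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds the memory operations for ONE avrdude run
// instead of two. Programmer, port and part options are left to the caller.
final class SingleCommandArgs {

    static List<String> build(String unlockBits, String efuse, String hfuse,
                              String lfuse, String bootloaderHex, String lockBits) {
        List<String> args = new ArrayList<>();
        args.add("-e");                                 // chip erase first
        args.add("-Ulock:w:" + unlockBits + ":m");      // initial (unlocked) lock bits
        if (efuse != null) {
            args.add("-Uefuse:w:" + efuse + ":m");      // not every part has an efuse
        }
        args.add("-Uhfuse:w:" + hfuse + ":m");
        args.add("-Ulfuse:w:" + lfuse + ":m");
        args.add("-Uflash:w:" + bootloaderHex + ":i");  // bootloader image, Intel hex
        args.add("-Ulock:w:" + lockBits + ":m");        // final lock bits go last
        return args;
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        List<String> ops = build("0x3F", null, "0xDE", "0xFF", "bootloader.hex", "0x2F");
        System.out.println("avrdude <programmer options> " + String.join(" ", ops));
    }
}
```

The order of the list matters here: the final lock bits have to come after the flash write, which relies on avrdude performing the -U operations in the order given.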

Original issue reported on code.google.com by bperry...@gmail.com on 22 Sep 2011 at 8:33

Attachments:

GoogleCodeExporter commented 9 years ago
I don't know if you CAN do the burn in one step. The current process changes
the fuses, burns the bootloader, and then changes the fuses again (just the
lock bits?).
Does AVRDUDE preserve the ordering of program memory commands? (Note that the
makefile in the bootloader source directory uses the same process.)

Original comment by wes...@gmail.com on 23 Sep 2011 at 1:29

GoogleCodeExporter commented 9 years ago
According to the avrdude usage message:
-U <memtype>:r|w|v:<filename>[:format]
                             Memory operation specification.
                             Multiple -U options are allowed, each request
                             is performed in the order specified.

I can't be sure whether the initial lock fuse value was written as well
as the final lock fuse value, or just the last one, but does it really matter?

-e should reset all the lock bits so isn't writing a value of 0x3f a NOP?
(upper 2 bits are ignored for atmega328p)
And then it is the final lock bits value that matters.

I see a value of FE reported for the final lock bits by the fusebytes sketch
(0x2e as reported by the dragon)
whether I use the two separate avrdude commands with a delay of a few seconds
between them or if it is run as single avrdude command as in the attached
makefile above.

OR whether I use the -U lock:w:0x3f:m in the initial fuse setting or leave it
out completely with either two commands or the single command.

Why set the lock bits in two steps? 
Especially when the initial setting is the same as using the -e option which
is being used.
Is there some kind of workaround for some devices in play here?

Original comment by bperry...@gmail.com on 23 Sep 2011 at 3:48

GoogleCodeExporter commented 9 years ago
Ok, so I have verified that if multiple fuse/lock updates are requested,
avrdude will process all of them in the order they are requested.
So for example:
avrdude -c dragon_isp -P usb -p atmega328p -e -u -U lock:w:0x3f:m -U lock:w:0x2f:m

will work, whereas:
avrdude -c dragon_isp -P usb -p atmega328p -e -u -U lock:w:0x2f:m -U lock:w:0x3f:m

will fail because you can't un-clear bits: once a lock bit has been cleared,
only a chip erase sets it again.
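
The bit arithmetic behind those two results can be modelled in a few lines. This is a simplified model, assuming that a lock-bit write can only change bits from 1 to 0, that the chip erase (-e) leaves all six lock bits at 1, and that the upper two bits are ignored as on the atmega328p. The class and method names are hypothetical.

```java
// Simplified model of lock-bit writes; not avrdude code.
final class LockBits {
    static final int MASK = 0x3F; // six lock bits; the upper two bits are ignored

    // A write can verify only if it does not ask for a 1 where the chip holds a 0.
    static boolean canWrite(int current, int requested) {
        return ((requested & MASK) & ~(current & MASK)) == 0;
    }

    // Applies the writes in order, starting from the erased state (all bits 1).
    static boolean sequenceSucceeds(int... writes) {
        int current = MASK;
        for (int w : writes) {
            if (!canWrite(current, w)) {
                return false;
            }
            current &= (w & MASK);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sequenceSucceeds(0x3F, 0x2F)); // true
        System.out.println(sequenceSucceeds(0x2F, 0x3F)); // false: bit 4 cannot go back to 1
    }
}
```

In this model the first write of 0x3f after an erase changes nothing, which matches the observation that the final lock value is the same with or without it.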

But I am still wondering why the lock bits are initially set to 0x3f.

Original comment by bperry...@gmail.com on 23 Sep 2011 at 8:19

GoogleCodeExporter commented 9 years ago
Hi,
I can confirm your problem also exists with version arduino-1.0-RC1
(Ubuntu 11.04).

I did some testing and as a workaround raising the timeout between the two 
avrdude commands from 1 to 2 seconds solved the problem for the time being. 

But you are right, burning the bootloader can be done in a single call to 
avrdude. 

At least it works with the AVR-Dragon and the AVRIspMKII. 

Attached is a file that shows the single-call upload command that works on my
machine.
Eberhard

Original comment by e.fa...@wayoda.org on 23 Sep 2011 at 3:24

Attachments:

GoogleCodeExporter commented 9 years ago
The file from the previous comment I made was for testing the AVRispmkII.
For the dragon the programmer has to be specified as '-cdragon_isp'

Eberhard

PLEASE !!!!!
Move this issues list over to github where we can EDIT comments, and don't have 
to write a new one all the time.

Original comment by e.fa...@wayoda.org on 23 Sep 2011 at 3:32

GoogleCodeExporter commented 9 years ago
Yes, that is essentially doing the same thing as the modified optiboot Makefile
I attached in the initial post.
I also have an updated atmega Makefile, as I have tested one of those as well,
but to recompile the atmega bootloader, the atmega bootloader code itself
also needs a small patch to compile with the latest
gcc tools, which I also have.

================================================================================

As mentioned in the first post, there are essentially 3 options that can be
used to work around this issue:
1) update the Makefiles & Java IDE code to do blind sleeps between the two commands
2) update the Makefiles & Java IDE code to do a single command vs two commands.
3) update the avrdude code to poll for the USB devices.

All of them work (at least with Dragon on Ubuntu) and are very small 
modifications.

One thing I didn't realize is that the Arduino IDE for Linux already ships its
own avrdude and does not use the avrdude on the system. I had to overwrite the
Arduino version of avrdude in {installdir}hardware/tools.
So given Arduino already ships its own avrdude, it might be an option
to ship a patched avrdude that eliminates the back-to-back problem.
(It is only 7 lines of code to add this functionality.)

================================================================================

Here is more background information from an Arduino optiboot thread that I am
duplicating here for completeness (trying to keep all the information here).

I tracked things down further and found that avrdude is using the libusb 0.1
API vs the current 1.0 API. It is using a 0.1 to 1.0 mapping layer.
The older API isn't an issue;
the real gotcha is that libusb uses usbfs and makes ioctl calls into it.
The Linux usbfs documentation says that there are kernel issues
with the USB reset ioctl and recommends not using it since the device state is
not restored.
They do however return a status indicating this, and
the libusb documentation recommends re-scanning for your device if you get this
particular error status.

The issue is that when an application disconnects from the USB device, the
device is unavailable until it re-enumerates.

As far as this being solved in libusb, it is kind of tough because the
running application
has no knowledge of this interim state created by a previous application,
and you don't really want to hold up the previous
application's device close and termination on the happenstance that another
application will need to talk to the same device.

My guess is that the libusb guys will say that there are still outstanding
kernel issues in usbfs related to usb_reset() and that there is enough status
information and enough existing API calls to
poll for the device at the application layer
if you want/need a more robust solution, because the polling really needs to be
handled on the front-end device open and device scan, which is under control
of the application, not libusb.

So that would push any solution back into avrdude or even higher into
the IDE or Makefiles.

Original comment by bperry...@gmail.com on 23 Sep 2011 at 6:40