keirf / greaseweazle

Tools for accessing a floppy drive at the raw flux level
The Unlicense

F1 using APM32F103: Update fails with v0.21+ #72

Closed M1kerochip closed 3 years ago

M1kerochip commented 3 years ago

Hi,

I've done some testing, and it looks like the last version of GW that writes a whole disk for me, is 0.21. Tried multiple drives, multiple STM32F1 "bluepill" devices. (Generic, STM32F103 from Robotdyn, APM32F103 from Robotdyn)

At random stages, but on earlier tracks usually, 0.22 fails to complete writing a disk image.

I've tried writing the same disk image, from 0.17-0.21 and it wrote 5 times in a row. (and multiple times, just to be sure, on 0.21)

With 0.22, I randomly get:

First time:
F:\gw\Greaseweazle-v0.22>gw write D:\Programming\Projects\sdbox-master\IDE-Setup_adf.scp
Writing Track 11.1...Command Failed: WriteFlux: Disk is Write Protected

Second time:
F:\gw\Greaseweazle-v0.22>gw write D:\Programming\Projects\sdbox-master\IDE-Setup_adf.scp
Writing Track 12.0...Command Failed: WriteFlux: Disk is Write Protected

Third time:
F:\gw\Greaseweazle-v0.22>gw write D:\Programming\Projects\sdbox-master\IDE-Setup_adf.scp
Writing Track 22.1...Command Failed: WriteFlux: Disk is Write Protected

So, trying to write this disk, you can see, first it failed on track 11, then 12, then 22. On the fourth time, it was an earlier track. The farthest I got writing was the third attempt.
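
To compare runs like the three above, the failing track can be pulled out of captured gw output. This is a minimal sketch, assuming the output lines look exactly like the ones quoted in this report; `failed_track` is a hypothetical helper, not part of the gw tools.

```python
import re

# Matches lines such as:
#   Writing Track 11.1...Command Failed: WriteFlux: Disk is Write Protected
FAIL_RE = re.compile(r"Writing Track (\d+)\.(\d+)\.\.\.Command Failed: (.+)")

def failed_track(output):
    """Return (cylinder, head, message) for the first failure, or None."""
    for line in output.splitlines():
        m = FAIL_RE.search(line)
        if m:
            return int(m.group(1)), int(m.group(2)), m.group(3).strip()
    return None

runs = [
    "Writing Track 11.1...Command Failed: WriteFlux: Disk is Write Protected",
    "Writing Track 12.0...Command Failed: WriteFlux: Disk is Write Protected",
    "Writing Track 22.1...Command Failed: WriteFlux: Disk is Write Protected",
]
for text in runs:
    print(failed_track(text))
```

Collecting these tuples over many attempts makes it easier to see whether the failures cluster on particular tracks or are spread evenly.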

It makes no difference what physical disk/disk image combination I try, results are similar each time.

For 0.24, before it tries to write track 0.0, I get the write protect message.

This is the disk image I tried, https://modelrail.otenko.com/assets/amiga500/IDE-Setup.adf It makes no difference if it's an .adf, .hfe or .scp, results are the same.

keirf commented 3 years ago

Are you using dupont wires or a greaseweazle adapter pcb for connecting the blue pill boards?

M1kerochip commented 3 years ago

Sorry, I should have said: It's a PCB with resistors soldered on.

And I've reflashed each version from 0.16-0.24 to test.

keirf commented 3 years ago

I would still suspect bad or loose cable despite it seeming to be version related. But I will test 0.24 on my bluepill later.

M1kerochip commented 3 years ago

I've tried different PCBs, different cables, different STM32F1 boards.

With each combination, each time, the 0.22-0.24 versions fail to write.

Re-flash 0.20 and the same setup writes perfectly.

I've tried making a new cable, only 6cm, and it too works perfectly on all versions up to 0.21

I've tried writing a disk image 30+ times in the last day or so, and the only consistent failures are on the 0.22+ versions.

Atm, I have 2x PCBs, and 2x F1 boards. If I swap them, the 0.24 board fails, no matter which cable/drive it's connected to.

It's unlikely to be a cable issue, I think.

keirf commented 3 years ago

Dodgy write-protect switch in the drive perhaps? That is directly tested at the start of each track write, and that is the only place which returns ACK_WRPROT (which gets the write-protected error string in the gw application).

keirf commented 3 years ago

Also maybe ask more widely on the FB group. There are a lot of F1 users, and some proportion will certainly have updated to v0.22+ and attempted to write a disk.

Maybe also check pull-up resistance on the adapter board for floppy pin 28.

M1kerochip commented 3 years ago

I'm not a FB user unfortunately.

I don't think it's a drive issue, all 8 of my drives (3x PC, one new old stock, 2x Amiga, 3x Atari ST) behave exactly the same.

All of my F1 boards are either from RobotDYN (STM32F1 or APM32F1) or one other Chinese manufacturer, so I definitely don't have a lot of variety there.

I'm writing on Windows, too, if that makes a difference.

I'll try a dupont connection just for the hell of it.

keirf commented 3 years ago

Just tested v0.24 on Windows 10 and it works okay for me. I have posted on the FB group about this, linking to this issue ticket.

keirf commented 3 years ago

Which country are you in?

M1kerochip commented 3 years ago

Ireland!

keirf commented 3 years ago

Well I can only suggest send to me for test. I'm UK. Others on the FB group are going to test also.

M1kerochip commented 3 years ago

Good Stuff.

I'm out for a few hours, but later I'll try the duponts, and a different USB cable, too, just in case, and I'll dig out my laptop, and try on that, as well.

ourIThome commented 3 years ago

With the randomness in progress of failing ... Can I ask what spec your computer is? How busy is your computer?

M1kerochip commented 3 years ago

Sure: Win10 Pro, 16GB, i7-4790k, Samsung EVO 860 Pro.

Not a new machine, but, it isn't even vaguely stressed by this.

Most of the time, the CPU usage is at a few percent. The machine isn't doing anything else. And, like I said, a quick flash back to 0.20, and it writes fine.

It's not exactly random, either, 90% of the time it fails on track 10.0-12.1. 100% of the time, on 0.24, it fails on track 0.0

And, it's never failed to read a disk. (1,000+ disk dumps so far)

I've tried 4-5 more random disks, all failed to write on 0.22. Flash back to 0.20, all wrote fine. (I've tried writing the .upd file, and flashing the .hex with an ST-Link, neither of which seemed to make any difference)

gw bandwidth says it needs ~8.2 Mbps.

                   Min.  /  Mean  /  Max.
Write Bandwidth:  9.136  /  9.206  /  9.230 Mbps
Read Bandwidth:   9.038  /  9.553  /  9.734 Mbps

Estimated Consistent Min. Bandwidth: 8.134 Mbps
-> Max. Flux Rate: 0.915 Msamples/sec
-> Min. Ave. Flux: 1.093 us
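
The figures in the gw bandwidth output above can be cross-checked with a little arithmetic. This is only a sketch: the 0.9 factor is inferred from the numbers in this report (it is not taken from the gw source), and the variable names are made up for illustration.

```python
# Figures copied from the gw bandwidth output in this comment.
write_min_mbps = 9.136
read_min_mbps = 9.038
reported_flux_rate = 0.915   # Msamples/sec

# Inferred relationship: the estimate looks like 90% of the slower
# direction's minimum bandwidth.
est_mbps = 0.9 * min(write_min_mbps, read_min_mbps)

# A rate in Msamples/sec inverts directly to microseconds per sample.
min_flux_us = 1.0 / reported_flux_rate

print(f"{est_mbps:.3f} Mbps, {min_flux_us:.3f} us")
```

Both results agree with the reported 8.134 Mbps and 1.093 us, which suggests the USB link itself has comfortable headroom here.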

ourIThome commented 3 years ago

I know with my slow laptop, if I try to capture to scp and then open Outlook or Chrome, it stops the capture. So I think that rules that out.

Someone asked about your Python version ... Can I assume you're using the following file in its own clean folder?

https://github.com/keirf/Greaseweazle/releases/download/v0.24/Greaseweazle-v0.24-win.zip ... And that you're running gw.exe?

I'll see if I can find a blank disk to try and write to tomorrow.

Oh and what does GW.exe info show?

ourIThome commented 3 years ago

I had another thought ...

I believe GW has an erase option ... Are you able to complete this fine?

And (this doesn't make sense to me) is the floppy disk a high-density disk (two square holes), where the hole opposite the write-protect tab has no tab at all? If so, can you cover that hole up?

Have you tried writing a normal PC image to disk?

keirf commented 3 years ago

Could you attach a photo of your setup?

M1kerochip commented 3 years ago

Sure: https://i.imgur.com/KQXN08y.jpg

and F1s https://i.imgur.com/I9BIiJ4.jpg

If you want better photos, let me know.

I made a floppy female to female power lead, and just swapped it with duponts, just in case.

One thing I have noticed though: If you recall, earlier, I said the gw.exe update wasn't working for me: up to v0.21 the update command and .upd file work fine, and from 0.22-0.24, that doesn't work. I have to manually flash the .hex to upgrade to those versions.

Is it something to do with the version of Python bundled?? Has that changed?

Can I install python manually, the required libraries, and then run the python scripts under windows still?

M1kerochip commented 3 years ago

They're not amazing photos, I guess. I can get something better, if you like. (The white piece of tape on the APM32 is to dampen the lights)

I've tried different USB cables too, including a mini cable and a standard 1.8m one. (Both work fine with other USB devices.)

keirf commented 3 years ago

Okay the update problem is strange and unusual too. What exactly happens? You should always be able to gw update with any tools version, and the firmware will be updated to the version matching the tools.

M1kerochip commented 3 years ago

Nothing.

If I run either gw.exe update, or gw.exe update it just hangs.

I've left it for a few mins, and the screen stays sitting, waiting, nothing extra is output.

If I end task the gw.exe, I get an error about needing some buffers:

** FATAL ERROR: unpack requires a buffer of 2 bytes

But, if not, it just sits there forever.
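
For reference, "unpack requires a buffer of 2 bytes" is the message Python's struct module gives (on recent Python 3 versions) when it is handed fewer bytes than the format needs, which would fit a reply that never arrived from the device. A minimal sketch follows; `read_u16` and the `"<H"` format are illustrative assumptions, not the gw code.

```python
import struct

def read_u16(reply):
    """Decode a little-endian 16-bit value from a device reply."""
    (value,) = struct.unpack("<H", reply)
    return value

# A complete 2-byte reply decodes normally.
print(read_u16(b"\x18\x00"))

# An empty or truncated reply raises struct.error instead.
try:
    read_u16(b"")
except struct.error as err:
    print("FATAL ERROR:", err)
```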

keirf commented 3 years ago

Can you gw --bt update and bail out of the hang, and report the back trace you should see printed out.

M1kerochip commented 3 years ago

Doesn't seem to do anything:

F:\gw\Greaseweazle-v0.24>gw --bt update
Updating Main Firmware to v0.24...

F:\gw\Greaseweazle-v0.24>

Takes a long time to end the gw.exe task, too. (where usually it's within a second)

(Same from cmd.exe or Windows Terminal)

keirf commented 3 years ago

Okay well it is interesting that v0.22+ is precisely what fails for you in two ways: Writes and Updates.

If you run an old bootloader, by flashing with HEX to say v0.16, can you update the main firmware to v0.24 using the gw update method? It is weird if not, as neither the bootloader nor the update script has really changed since v0.21 or earlier (and of course you'd be running a bootloader [v0.16] that you know works!).

You can indeed run the Python scripts in Windows, just install Python 3 and the required libraries. You may get a warning about running "setup.sh" for accelerated routines, but you can ignore that for testing.

EDIT: Also did you try another host (your laptop) yet?

M1kerochip commented 3 years ago

> If you run an old bootloader, by flashing with HEX to say v0.16, can you update the main firmware to v0.24 using the gw update method? It is weird if not, as neither the bootloader nor the update script has really changed since v0.21 or earlier (and of course you'd be running a bootloader [v0.16] that you know works!).

yes, that's exactly what I did, but, I did it with the 0.17 gw.exe version.

> You can indeed run the Python scripts in Windows, just install Python 3 and the required libraries. You may get a warning about running "setup.sh" for accelerated routines, but you can ignore that for testing.

Same. 0.16 firmware, 0.16 tools, 0.24 update, Python 3.8.1 (which is what I used before) I get:

D:\Programming\Python38>python F:\gw\Software\Greaseweazle\Greaseweazle-v0.16\gw.py update F:\gw\Software\Greaseweazle\Greaseweazle-v0.24\Greaseweazle-v0.24.upd

Bootloader v0.16 [F1], Host Tools v0.16
FATAL ERROR: unpack requires a buffer of 2 bytes

Device remains in update mode, no crashing, no lockup, immediately displays the error, and returns to command line.

> EDIT: Also did you try another host (your laptop) yet?

Yes, flashed 0.24 .hex, and tried. Exactly the same. Failed to write 0.0 - write protected.

keirf commented 3 years ago

You can't update to later firmware from earlier tools. In fact the later tools are explicitly locked to only update to exact version-matched firmware. So, to update to v0.24 firmware, run gw update from v0.24 tools. (EDIT: This is all checked properly in more recent tools, it was a mistake to make the earlier update scripts a bit fast and loose and fail in non-obvious ways).
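
The version lock described above amounts to a check along these lines. This is a hypothetical sketch only: `check_update_allowed` and its error text are invented for illustration and are not the actual gw update code.

```python
def check_update_allowed(tools_version, firmware_version):
    """Refuse an update unless the payload matches the tools exactly."""
    if tools_version != firmware_version:
        raise ValueError(
            f"Tools v{tools_version} can only install firmware "
            f"v{tools_version}, not v{firmware_version}")
    return True

# Matching versions pass.
print(check_update_allowed("0.24", "0.24"))

# v0.16 tools with a v0.24 payload are rejected up front,
# rather than failing later in a non-obvious way.
try:
    check_update_allowed("0.16", "0.24")
except ValueError as err:
    print(err)
```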

I assume you are running the v0.24 tools with v0.24 firmware? I think you have to, actually, at least I did version-lock tools to firmware from the very beginning...

M1kerochip commented 3 years ago

OK, update using the 0.16 firmware, with 0.24 tools, and 0.24 .upd works fine.

D:\Programming\Python39>python F:\gw\Software\Greaseweazle\Greaseweazle-v0.24\gw.py update F:\gw\Software\Greaseweazle\Greaseweazle-v0.24\Greaseweazle-v0.24.upd
*** WARNING: Optimised data routines not found: Run scripts/setup.sh
Updating Main Firmware to v0.24... Done.
Disconnect Greaseweazle and remove the Programming Jumper.

OK, so, 0.16 firmware, 0.24 tools and host:

D:\Programming\Python39>python F:\gw\Software\Greaseweazle\Greaseweazle-v0.24\gw.py write D:\Programming\Projects\sdbox-master\IDE-Setup.adf
*** WARNING: Optimised data routines not found: Run scripts/setup.sh
Writing c=0-81:h=0-1
Writing Track 1.1...Command Failed: WriteFlux: Disk is Write Protected

It tries to write 0.0 and 0.1 and 1.0 and fails on 1.1

That's the first time a 0.24 gw didn't fail on 0.0

keirf commented 3 years ago

Please can you grab a backtrace: --bt after gw.py on command line.

eg python F:\gw\Software\Greaseweazle\Greaseweazle-v0.24\gw.py --bt write D:\Programming\Projects\sdbox-master\IDE-Setup.adf

M1kerochip commented 3 years ago

D:\Programming\Python39>python F:\gw\Software\Greaseweazle\Greaseweazle-v0.24\gw.py --bt write D:\Programming\Projects\sdbox-master\IDE-Setup.adf
*** WARNING: Optimised data routines not found: Run scripts/setup.sh
Writing c=0-81:h=0-1
Writing Track 2.0...Command Failed: WriteFlux: Disk is Write Protected

Nothing after that.

keirf commented 3 years ago

Oh I swallow the error. That's annoying. Can you put a raise on line 93 of scripts/greaseweazle/tools/write.py? It would go aligned with and directly underneath the print() statement. I don't know if you're handy with Python at all. If not I can roll you a build to try...

EDIT: Sorry I was looking at v0.16. That should be line 187 of write.py

M1kerochip commented 3 years ago

I've never used python! But putting raise on that line gives me:

TabError: inconsistent use of tabs and spaces in indentation
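
For anyone trying the same edit: a bare `raise` placed directly under the `print()` inside an `except` block re-raises the caught error, and it has to use the same indentation characters as the line above it. The sketch below shows both points; `CmdError` and `write_track` are stand-ins written for this example, not the code in write.py.

```python
class CmdError(Exception):
    """Stand-in for the error raised when the device rejects a command."""

def write_track(fail):
    try:
        if fail:
            raise CmdError("WriteFlux: Disk is Write Protected")
    except CmdError as error:
        print("Command Failed: %s" % error)
        raise   # same indentation as print(): re-raises the caught error

# The same edit made with a tab under space-indented code does not compile.
mixed = "if True:\n        x = 1\n\traise\n"
try:
    compile(mixed, "<example>", "exec")
except TabError as err:
    print("TabError:", err)
```

Using an editor that shows whitespace, or copying the indentation of the `print()` line exactly, avoids the TabError.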

keirf commented 3 years ago

Ah no worries. Let me make a branch and I can send you test versions to try out.

keirf commented 3 years ago

Okay, can you download the artifact from this build run: https://github.com/keirf/Greaseweazle/actions/runs/507423735

  1. You'll need to be logged in to github
  2. Link is then at the bottom of the page
  3. The download is a zip containing three zips. You want the inner zip without -win or -debug in the name. That's a plain Python-source distribution.
  4. Unzip it, gw update to the firmware within, and try your gw write command. No need for --bt as that's hardcoded here.
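
Step 3 above can be expressed as a small filter over the names inside the outer zip. This is a sketch under assumptions: the file names in the listing are hypothetical, and `pick_source_zip` is an invented helper.

```python
def pick_source_zip(names):
    """Return the one inner zip that is neither the -win nor -debug build."""
    plain = [n for n in names
             if n.endswith(".zip") and "-win" not in n and "-debug" not in n]
    if len(plain) != 1:
        raise ValueError(f"expected exactly one plain zip, found {plain}")
    return plain[0]

# Hypothetical listing of the outer artifact's contents.
inner = [
    "Greaseweazle-test-win.zip",
    "Greaseweazle-test-debug.zip",
    "Greaseweazle-test.zip",
]
print(pick_source_zip(inner))
```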

Let me know what backtrace you get :) In fact send me everything from your command-line invocation down. That works well.

M1kerochip commented 3 years ago

D:\Programming\Python39>python F:\gw\Software\Greaseweazle\Greaseweazle-fa0cd7d\gw.py write D:\Programming\Projects\sdbox-master\IDE-Setup.adf

gives

TEST/PRE-RELEASE: commit fa0cd7d508913f3cb939e3db0e9db83e05f0558e
Use these tools and firmware ONLY for test and development!!
WARNING: Optimised data routines not found: Run scripts/setup.sh
Writing c=0-81:h=0-1
Writing Track 1.1...Command Failed: WriteFlux: Unknown Error (20)
Traceback (most recent call last):
  File "F:\gw\Software\Greaseweazle\Greaseweazle-fa0cd7d\gw.py", line 11, in <module>
    import gw
  File "F:\gw\Software\Greaseweazle\Greaseweazle-fa0cd7d\scripts\gw.py", line 75, in <module>
    res = main(argv)
  File "F:\gw\Software\Greaseweazle\Greaseweazle-fa0cd7d\scripts\greaseweazle\tools\write.py", line 184, in main
    util.with_drive_selected(write_from_image, usb, args, image)
  File "F:\gw\Software\Greaseweazle\Greaseweazle-fa0cd7d\scripts\greaseweazle\tools\util.py", line 184, in with_drive_selected
    fn(usb, args, *_args, **_kwargs)
  File "F:\gw\Software\Greaseweazle\Greaseweazle-fa0cd7d\scripts\greaseweazle\tools\write.py", line 84, in write_from_image
    usb.write_track(flux_list = flux_list,
  File "F:\gw\Software\Greaseweazle\Greaseweazle-fa0cd7d\scripts\greaseweazle\usb.py", line 431, in write_track
    raise error
  File "F:\gw\Software\Greaseweazle\Greaseweazle-fa0cd7d\scripts\greaseweazle\usb.py", line 420, in write_track
    self._send_cmd(struct.pack("4B", Cmd.WriteFlux, 4,
  File "F:\gw\Software\Greaseweazle\Greaseweazle-fa0cd7d\scripts\greaseweazle\usb.py", line 205, in _send_cmd
    raise CmdError(cmd, r)
greaseweazle.usb.CmdError: WriteFlux: Unknown Error (20)

keirf commented 3 years ago

Hmm well, a write command was sent to the firmware, and it really did detect that the write-protect line (floppy pin 28; BluePill pin B8) is asserted. I changed the error code at that one place in the firmware to 20, and that is being reported through the tools.

You say it's on multiple drives and multiple disks and multiple ribbon cables, which ought to discount the possibility of a dodgy switch, disk or cable. Maybe we should take a closer look at your adapter(s)?

This does smell like a floating input pin getting randomly pulled low.
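
The "Unknown Error (20)" text follows from how a code-to-string lookup behaves when the firmware returns a code the tools have no name for. A minimal sketch of that idea: the numeric value given to `ACK_WRPROT` is an assumption for illustration, and `describe_ack` is a made-up helper; only the two message formats come from this thread.

```python
# Hypothetical ack table; only the message formats come from the thread.
ACK_WRPROT = 6          # assumed value, for illustration only
ERROR_STRINGS = {ACK_WRPROT: "Disk is Write Protected"}

def describe_ack(cmd_name, code):
    """Render an ack code in the style of the messages quoted above."""
    text = ERROR_STRINGS.get(code, "Unknown Error (%u)" % code)
    return "%s: %s" % (cmd_name, text)

print(describe_ack("WriteFlux", ACK_WRPROT))
print(describe_ack("WriteFlux", 20))
```

Returning a distinct, otherwise unused code from one spot in the firmware is a quick way to confirm exactly which check produced the failure.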

keirf commented 3 years ago

Can you do a real close up of your adapter, especially where the resistors attach. Did you solder them up yourself?

M1kerochip commented 3 years ago

OK, I think you can close this.

Putting the F1 that was connected via the duponts back in the PCB, it immediately gives a "Disk is Write Protected" error, where it wrote to track 75 earlier, with the duponts (but the Epson drive)

But. I flashed it back to 0.20. And it wrote the disk no problem. So. Whatever the reason, 0.22+ is failing on the write, where 0.20 isn't.

I guess, I'll remove the resistors, and clean the PCB, and put new ones on. They're 4x 0805 1K 1% resistors. (I didn't measure the resistance before I installed them, I just assumed they were fine)

Just touching one side of the solder to the other with the multimeter, it shows a reading of 996 ohms on all four resistors, but, I'm not sure which pins they connect to on the PCB to test properly.

I have more PCBs but I've misplaced them! Otherwise I'd just try another one.

keirf commented 3 years ago

Regarding your adapter, those solder joints are not good. Can you see on some of the resistors, on one side, the solder has not spread onto the PCB pad? That is not a proper electrical connection: It will look connected some of the time, but depending how the PCB flexes or heats, the junction will show high resistance. You do not need a new adapter: You need to reflow those joints with flux and touch an iron to them. The flux will cause the solder to flow nicely.

To your other errors:

  1. "Failed to write Track 75.1": This is a media failure where that ADF track has failed to verify after write, after three attempts. The error really needs to be clarified (eg Verify error on Track 75.1) and I will be doing that for next release.
  2. Haven't seen MemoryError before. Yes there will be possibly 100s of MBs held in memory before writing the image file. But modern systems should have gigabytes so not a problem. :) Is your host system here memory constrained?

EDIT: Actually the MemoryError is the most interesting thing in the ticket now!

keirf commented 3 years ago

Regarding the bad electrical connection, I suspect that the resistor for floppy pin 28 doesn't connect at all, and that line floats when not asserted by the drive. In this case the line can read any value at any time. It is pure chance that it works with earlier firmwares and less well with later ones.
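
The effect of a missing pull-up can be illustrated with a toy model. This is a simulation sketch, not firmware code: `read_wrprot` is invented for the example, and the 50/50 chance used for a floating line is an arbitrary assumption, since a real floating input depends on noise, layout and the chip's input stage.

```python
import random

def read_wrprot(drive_asserts, pullup_connected, rng):
    """Return True if the active-low write-protect input reads as asserted."""
    if drive_asserts:
        return True                  # the drive pulls the line low
    if pullup_connected:
        return False                 # the pull-up holds the idle line high
    return rng.random() < 0.5        # floating: the idle level is arbitrary

rng = random.Random(0)
# Count false "write protected" readings over 1000 idle samples.
good = sum(read_wrprot(False, True, rng) for _ in range(1000))
bad = sum(read_wrprot(False, False, rng) for _ in range(1000))
print(good, bad)
```

With the pull-up connected the idle line never reads as protected; without it, some fraction of reads do, which matches a write that fails at a different track on each attempt.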

M1kerochip commented 3 years ago

AH HA!

OK.

After writing down all my testing: I finally found the problem(s).

It's this board, and the resistor(s). https://robotdyn.com/black-pill-apm32f103cb-128kb-flash-20kb-sram-stm32-compatible-arm-cortexr-m3-mcu-mini-board.html

That particular "blackpill" has an APM32F103 onboard, and runs at 96 MHz, if that makes a difference.

I never noticed, since it's almost identical to the other STM32F103 I have from RobotDYN.

It reads perfectly, but, since 0.22+ it won't write.

Resoldering the four resistors: Now each of the STM32F103 boards reads and writes correctly, either on duponts, or on the PCB.

However! The APM32F103 fails, randomly, to write on the PCB now. It sometimes fails on track 0.1 but mostly on 0.0. Connected via duponts it fails only on higher track numbers, but it always fails. Reflashing 0.20 on it, it writes OK, on the PCB or direct connect.

And of course, the first board I tried, connected via the duponts, was the APM32 board, and the STM32 board I connected was on firmware 0.20. It was only by making notes, and testing properly, that I found it.

SO. Mystery solved, more or less.

The MemoryError: I've been able to recreate that, too. I opened a webpage and it ate up huge amount of ram, crashed, didn't release the ram, and continued to chew ram. I never noticed until later. When I eventually went back through my steps, yes, I'd run out of memory, I'm guessing. (I killed chrome at one stage, and it was after the memory error, so I'm guessing that's the cause) (Don't browse and use Chrome kids!)

If I use enough ram, I can get the memory error. (I have to open a hundred+ web pages, but it happens eventually.)

I can post you one of those APM32 boards, if you like. I think I have one or two of them here unused. They're about €2, which is why I bought some on my last order, to test them out. They worked fine, so I left them in the PCB.

keirf commented 3 years ago

Yes I'd be happy to take a look at one. My email is keir.xen@gmail.com so drop me a message to get my mailing address. Cheers!

keirf commented 3 years ago

So to be sure: The APM32F103 still is failing specifically with the "Disk is Write Protected" error?

EDIT: If yes, take a very close look at the soldering of your header pins to the Black Pill. Especially at pin B8. Does it have a nice fillet down onto the board's pad? Or does it look like a ball/blob of solder on the header pin?

M1kerochip commented 3 years ago

Yeah. The APM32, and only that board, is failing with that specific error. It also fails to update to 0.22+, with the .upd file. (So, just the bare board connected via the USB to the PC)

B8 looks fine, I think.

This particular APM32 I bought already soldered, and came with a QA sticker on it.

I can open another one, solder on the legs, and see if the new one is the same?

keirf commented 3 years ago

Yes I would. And if so you still have one to send me.

M1kerochip commented 3 years ago

OK! Yeah, can do. A job for tomorrow though, I think.

M1kerochip commented 3 years ago

OK! Final update:

I soldered up a new APM32, and it worked fine on reading/writing! So! Good news there. It still, however, fails to run the .upd file.

I reflowed the solder on B8 on the pre-soldered one I bought, and it now writes properly too! (But it still fails on the update.) SO!

Good news, I guess. I still have one APM32, which I can post.

Photo of the STM32F103 on the right, and APM32F103 on the left. https://i.imgur.com/Szl780E.jpg (Easy to mix up!) the STM has a proper indent on the chip. https://i.imgur.com/besftHM.jpg

The soldered/non soldered APM32 packages: https://i.imgur.com/okisHi2.jpg

Somewhere along the way, my proper STM32F103 RobotDYN board has died. The PC no longer picks it up.

In conclusion: All good, and I'll post the APM towards the end of the week :)

keirf commented 3 years ago

So you say it used to run the update but now doesn't? Or has it never worked on APM32?

You jumper SCK-GND, and the LED should flash and you should be in bootloader mode. But that doesn't happen?

M1kerochip commented 3 years ago

Yup.

The APM32 ran the .upd file up to 0.20, and on 0.21+ it just hangs instead of flashing.

It goes into update mode, you run the gw update .upd file, the light stops blinking, and the gw.exe hangs.

So, the initial problem I had can be broken down in to 3 problems:

1) I can't update the APM32F103 using the .upd file on 0.21+

2) APM32 failed to write connected via duponts,

3) APM32 failed to write connected via the PCB.

problem 2) was solved reflowing the solder on B8.

problem 3) was solved resoldering the resistors on the PCB.

Issue 1) still remains.

keirf commented 3 years ago

Ok, but you can update via HEX file on the APM32 still?

It sounds like the APM32 crashes, if the LED goes out... Maybe I can tweak the update process to still work with those chips.

keirf commented 3 years ago

I can safely say nothing has changed between v0.20 and v0.21 which should affect the update bootloader in any way. Except the main firmware has got fractionally bigger. The F1 update payload in v0.20 is 9660 bytes. In v0.21 it is 9760 bytes.

Do you know what version bootloader you have in place when you do these tests? (You can also check that with gw info while in bootloader mode.)