Ralim / IronOS

Open Source Soldering Iron firmware
https://ralim.github.io/IronOS/
GNU General Public License v3.0

Pinecil v2 doesn't work after updating to v2.21 #1661

Closed sorgelig closed 1 year ago

sorgelig commented 1 year ago

Describe the bug: After flashing v2.21 with blisp, my Pinecil v2 looks completely dead.

To Reproduce

  1. Flash v2.21 (Pinecilv2_EN.bin)
  2. iron looks dead, no display on screen

Expected behavior: normal operation

Details of your device:

Additional context: v2.21 doesn't seem to work on my Pinecil v2. Thankfully, flashing v2.20 brings the iron back to life.

sorgelig commented 1 year ago

According to the datasheet, the baud rate on the FT232 is calculated as bps = 3000000/(n+x), where min n = 2 and min x = 0. So the max bitrate is 3000000/2 = 1500000 bps. Anyway, I've tried it: no activity on UART. Maybe I'm doing something wrong, or 2 Mbps is too much for the FT232.
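The arithmetic in this comment can be checked with a small sketch. It takes the formula exactly as quoted above at face value; the function name and the constant name are made up for illustration, and the constraints (n at least 2, x at least 0) are the ones stated in the comment, not verified against the datasheet here.

```python
# Sketch of the divisor formula quoted in the comment above:
#   bps = 3000000 / (n + x), with n >= 2 and x >= 0.
# Names are illustrative only; the limits come from the comment.
FT232_BASE_RATE = 3_000_000


def ft232_baud(n: int, x: float = 0.0) -> float:
    """Return the baud rate for integer divisor n and fractional part x."""
    if n < 2:
        raise ValueError("the comment states a minimum n of 2")
    if x < 0:
        raise ValueError("the comment states a minimum x of 0")
    return FT232_BASE_RATE / (n + x)


# Highest rate reachable under the stated limits.
max_rate = ft232_baud(2, 0.0)

# The firmware's debug UART runs at 2000000 baud, per this thread.
pinecil_rate = 2_000_000
can_reach = max_rate >= pinecil_rate
```

Under those stated limits, `max_rate` comes out below the 2000000 baud the firmware uses, which is the point the comment is making.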

sorgelig commented 1 year ago

Which crazy guy decided such insane UART speed? 2mbps to output text? 115200 is standard industrial UART speed..

River-Mochi commented 1 year ago

There's information for the Pinecil Breakout board in the [pine64 wiki]

:) I added that photo to wiki recently since there are often questions on how to set up UART & Break out board. photo taken in my back yard.
Not an expert; I just followed Ralim's directions in the past to help with debugging and made the photo to help people. There is another method too, if you don't own a Pinecil Breakout board, that I can upload if people need it: you buy a USB-C M-F passthrough PCB from eBay or AliExpress and solder header pins to GND, BC2 and BC3 (typically costs $3). You still need a siPeed or NuSOM or some other UART board that can do the high baud rate of 2000000.

gamelaster commented 1 year ago

@sorgelig this baudrate is used by Bouffalo as default baudrate. It is quite trend nowadays to use such high baud rates. If you really have an FT232 (without RL or H on the end), it is likely that it might not be able to support that high baudrate.

River-Mochi commented 1 year ago

might not be able to support that high baudrate.

I can confirm what Gamiee is saying: some people in the Pinecil chat can't use their low baud rate UART tester boards. I can confirm that siPeed ($5, AliExpress) and nuSOM brand ($13, Amazon) both worked when I tested them. I bought 3 different UART boards from Amazon, and 2 did not work, so I had to return them.

sorgelig commented 1 year ago

It is quite trend nowadays to use such high baud rates.

i don't agree. Standard rate 115200 is used almost everywhere. Probably you deal only with same company or group of ppl who tend to use such unreasonably high speed.

River-Mochi commented 1 year ago

If you don't own a Pinecil breakout board. could use B3, B2, and B1 (GND) on this generic M-F passthrough board (it doesn't have all features of Pinecil Board, but can do UART). image

For people who don't have the nice Pinecil break out board, it's possible to do UART with these cheap PCB M-F. Connect one USB-C side to Pinecil and other USB-C to charger cable and charger. Note, one person had to repair the one they got from Ebay bc trace was defective. his VCC was bridged to BC3 so he had to fix that. check with meter on these cheap PCBs that VCC is not connected to any other lines like the BC2/BC3 needed.

image

From Lupyuen's article on BL706

Bouffalo defined the baud rate on their BL706 chips https://lupyuen.github.io/articles/bl706 image

SiPeed brand is not required; any UART board will work, I got this bc it was $5 on aliexpress compared to ones that were $12-$15 on amazon and can confirm it works. image

River-Mochi commented 1 year ago

I don't think i want to open iron, i have bad experience in opening TS100.

if you do want to open it there is EASY trick to open Pinecil without breaking plastic clips. https://www.youtube.com/watch?v=aK01V5DrrVk

opening is not needed to get UART debug messages, just need passthrough USB-C boards like above generic one (or Pinecil break out board) and a free program like PUTTY, then watch screen while using pinecil. https://www.putty.org/

PSandro commented 1 year ago

I don't think i want to open iron, i have bad experience in opening TS100.

anyone who wants EASY trick to open Pinecil without breaking plastic clips. https://www.youtube.com/watch?v=aK01V5DrrVk

opening is not needed to get UART debug messages, just need PCB boards like above and a free program like PUTTY, then watch screen while using pinecil. https://www.putty.org/

Let's not get off-topic :) I think this info is greatly appreciated in the wiki. I hope we can keep the comments here on a minimum so that everybody has an overview.

So when I receive the Breakout Board I'll start testing to find out where the Pinecil gets stuck during boot and will share the UART log here (I only have an FT232RL, fingers crossed that it works; otherwise I'll try to build a firmware with a lower baudrate).

@River-Mochi you seem to have the hardware needed to get the UART logs of your pinecilv2, can you share them for the blackscreen/chip crash scenario?

chiefos commented 1 year ago

@Ralim, may it be acceptable workaround to resolve the issue by adding in FW some delay to initialize BT only after Pinecil is already powered on?

jonatan-ivanov commented 1 year ago

I'm not really on top of all of these comments but would it make sense to:

  1. Add a warning with the link to this issue to the release notes so people who check it before flashing the new version can decide if they want to go ahead or wait?
  2. Releasing 2.21.1 or 2.22 (depending on versioning strategy) where BT is disabled by default (assuming that's the issue) and also making a comment in the new Release Notes that it might not work with the link of this issue?

PSandro commented 1 year ago

So I did some testing with building firmware, commenting some lines of code out and found the line of code where the blackscreens start happening: https://github.com/Ralim/IronOS/blob/da18b9b60f6b4f4f9b23a03f6c5366e79d1a978b/source/Core/BSP/Pinecilv2/ble.c#L27

basically when I comment out this line and the following hci_driver_init(); then there are no blackscreens/crashes on boot (of course there's also no bluetooth functionality). I don't know anything about FreeRTOS but in the ble example of bouffalo_sdk, they call the function using xTaskCreateStatic - does that maybe make a difference?

Also, bouffalo_sdk v2.0.0 was released two weeks ago and they changed the examples for ble a bit. One thing that's immediately noticeable: There's something called rfparam_init, maybe this needs to be done before initializing the ble_controller?

image

I don't know if bouffalo_sdk v2.0.0 can be integrated into this code or if there's even a plan to do so. Maybe this would resolve this issue? :smile:

PS:

howels commented 1 year ago

Same here after update to 2.21, black screen unless button held during the boot. Disabling Bluetooth makes it boot ok every time.

River-Mochi commented 1 year ago

@Ralim UART testing: since the screen is either good when I plug it in (most of the time) or black. I'm not sure how to set up UART exactly for what you need collected.

in the past for UART, I connected to windows 11, opened device manager, find the com port pinecil uart device connected to and then open Putty to that specific comm port. I have it connected now and I see data on putty when I do various things with pinecil and pineSAM. I don't think this is what is wanted though?

Sample Putty screen showing UART comm:

I need more info on:

  1. How exactly should I set up for this specific test to get data that would be helpful? (Not the physical connection to the UART tester hardware; I have all that set up.)
  2. Is there a way to send all the PuTTY output to a log file, so I can just save a log to submit to you instead of actively watching the PuTTY screen?
  3. If I plug in the Pinecil and I don't get a black screen, is there any point in collecting UART info from PuTTY in that case?
  4. Is there a special build of FW that could be loaded on the Pinecil to get the better UART data that is needed?
  5. Since it's either black when I plug in or it's all fine, how do I best capture the info needed over UART, so I'm not just watching UART output that is not needed?
  6. Do I just hook up the tester and keep unplugging the cable to the charger until the screen is black on boot, and then you want a screen capture of what is on PuTTY?

Each time I plug and unplug the charger power. the Putty window stays open as it's still connected to the same com port and I get this when screen is working and turns on right away. This is when it's working fine. Since it's only been black screen for me on boot up about 2 times in past 2 days I was not set up in putty when it happened.
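For the question about saving the UART output to a file instead of watching the terminal, one possible approach is a small script that copies each received line into a timestamped log. This is only a sketch: `log_uart` is a made-up helper, and it reads from any binary stream that has `readline()`. Opening the real port is left as a comment because it needs the third-party pyserial package and a port name, both of which are assumptions here.

```python
import io
import time
from typing import BinaryIO, Callable, TextIO


def log_uart(stream: BinaryIO, out: TextIO,
             clock: Callable[[], float] = time.time) -> int:
    """Copy lines from stream to out, prefixing each with a timestamp.

    Stops at the first empty read (end of stream, or a read timeout on a
    serial port). Returns the number of lines written.
    """
    count = 0
    for raw in iter(stream.readline, b""):
        # Replace undecodable bytes so line noise cannot abort the capture.
        text = raw.decode("ascii", errors="replace").rstrip("\r\n")
        out.write(f"{clock():.3f} {text}\n")
        out.flush()  # keep the file current in case the run is interrupted
        count += 1
    return count


# Demo with an in-memory stream standing in for the serial port.
demo_port = io.BytesIO(b"Pine64 Pinecilv2 Starting\r\nBLE Starting\r\n")
demo_log = io.StringIO()
lines_logged = log_uart(demo_port, demo_log, clock=lambda: 0.0)

# With real hardware (assumption: pyserial installed, port name is yours):
#   import serial
#   with serial.Serial("COM5", 2000000) as port, open("uart.log", "a") as f:
#       log_uart(port, f)
```

Because every line is flushed as it arrives, the log still contains the last messages before a crash even if nobody is watching the screen at that moment.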

PSandro commented 1 year ago

I received the breakout board today and could observe the UART while the Pinecil failed to boot. Here's the full output:

show flash cfg:
jedec id   0x000000
mid            0xC2
iomode         0x11
clk delay      0x01
clk invert     0x01
read reg cmd0  0x05
read reg cmd1  0x00
write reg cmd0 0x01
write reg cmd1 0x00
qe write len   0x02
cread support  0x00
cread code     0x00
burst wrap cmd 0x77
-------------------
Enable IRQ's
Pine64 Pinecilv2 Starting
BLE Starting
            Trap_Handler
mcause=30000007
mepc:23023d7e
mtval:00000000
Store/AMO access fault

EDIT: that was output with IronOS compiled by myself. Now the output above is from freshly flashed firmware from releases
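The trap lines at the end of that log can be decoded mechanically. The sketch below pulls `mcause`, `mepc` and `mtval` out of the text and looks up the exception code in the standard RISC-V privileged-spec table. Masking to the low 12 bits and ignoring the upper bits (0x30000000 in this log) is an assumption about this core, not something confirmed in the thread; the helper names are made up.

```python
import re

# Synchronous exception codes from the RISC-V privileged specification.
EXCEPTION_NAMES = {
    0: "Instruction address misaligned",
    1: "Instruction access fault",
    2: "Illegal instruction",
    3: "Breakpoint",
    4: "Load address misaligned",
    5: "Load access fault",
    6: "Store/AMO address misaligned",
    7: "Store/AMO access fault",
    8: "Environment call from U-mode",
    9: "Environment call from S-mode",
    11: "Environment call from M-mode",
    12: "Instruction page fault",
    13: "Load page fault",
    15: "Store/AMO page fault",
}


def parse_trap(log: str) -> dict:
    """Extract mcause/mepc/mtval (hex, after '=' or ':') from a UART log."""
    found = re.findall(r"(mcause|mepc|mtval)[=:]([0-9a-fA-F]+)", log)
    return {name: int(value, 16) for name, value in found}


def describe_mcause(mcause: int) -> str:
    """Name the trap cause, using only bit 31 and the low 12 bits."""
    if mcause >> 31:
        return "interrupt"
    code = mcause & 0xFFF  # assumption: upper bits are core-specific status
    return EXCEPTION_NAMES.get(code, f"unknown exception code {code}")


sample = "mcause=30000007\nmepc:23023d7e\nmtval:00000000\n"
trap = parse_trap(sample)
cause = describe_mcause(trap["mcause"])
```

For this log the decoded cause agrees with the "Store/AMO access fault" line the firmware prints itself, and `trap["mepc"]` gives the faulting program counter to look up in the build's disassembly.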

PlazmaZero commented 1 year ago

I also got the black screen updating to 2.21 and disabling Bluetooth also fixed it for me.

Ralim commented 1 year ago

@PSandro Thank you for this, that is indeed really useful; I suspect it's faulting inside the BLE stack somewhere. I'll first try making it a delayed start to see if that helps. If not, I will have to figure out a more in-depth answer.

PSandro commented 1 year ago

@Ralim could you reproduce this on your Pinecil v2 yet? I still don't understand why it only happens to some people and only on some boots... For me it faults in roughly 70% of the boots. I looked up the instruction of the pc where it faults. If I'm not mistaken then it's inside some malloc, which itself is probably called within the BLE stack. Maybe there's not enough heap?

For the delaying: I tried delaying the BLE initialization (up to 5 seconds) and even started it in a dedicated thread. The fault still happens.

sj-louw commented 1 year ago

I have compiled the latest (as of 2023-04-23) dev branch with increased total HEAP https://github.com/Ralim/IronOS/blob/5faa092eb6b3c570ae48557030092883748a8a20/source/Core/BSP/Pinecilv2/FreeRTOSConfig.h#L17 from ((size_t)1024 * 8) to ((size_t)1024 * 14). Also, I have applied this fix https://github.com/bouffalolab/bouffalo_sdk/issues/30 in hal_mtimer.c and cannot reproduce the black screen anymore. To me, it seems to work OK now. Maybe someone else can also test the pre-compiled Pinecilv2_EN.bin, attached?

Pinecilv2_EN.bin.zip

PSandro commented 1 year ago

I have compiled the latest (as of 2023-04-23) dev branch with increased total HEAP

https://github.com/Ralim/IronOS/blob/5faa092eb6b3c570ae48557030092883748a8a20/source/Core/BSP/Pinecilv2/FreeRTOSConfig.h#L17

from ((size_t)1024 * 8) to ((size_t)1024 * 14). Also, I have applied this fix bouffalolab/bouffalo_sdk#30 in hal_mtimer.c and cannot reproduce the black screen anymore. To me, it seems to work OK now. Maybe someone else can also test the pre-compiled Pinecilv2_EN.bin, attached? Pinecilv2_EN.bin.zip

Hi, I tried the binary you provided, sadly for me the blackscreen / Store/AMO access fault still happens. Also tried building a firmware with the patches applied that you mentioned - same issue.

Is there anything I can do to help debugging this? I wanted to get a backtrace through JTAG using OpenOCD and GDB. Unfortunately I don't know how to debug FreeRTOS applications with multiple Threads with that setup, my breakpoint on bl_stack_start never gets reached. Anyways... the mepc at fault points to an instruction inside malloc_r. I don't know what to do with that info - has this something to do with newlib and FreeRTOS?

sj-louw commented 1 year ago

@PSandro Not sure, but I have found this: https://nadler.com/embedded/newlibAndFreeRTOS.html

Ralim commented 1 year ago

Hia, I tried adjusting the linkerfile to try and have some more ram allocated to the heap. If you could give the dev builds a spin would be great. If that doesnt fix it, I'll try increasing the freertos heap as well.

sj-louw commented 1 year ago

Hia, I tried adjusting the linkerfile to try and have some more ram allocated to the heap. If you could give the dev builds a spin would be great. If that doesnt fix it, I'll try increasing the freertos heap as well.

Hi @Ralim I have tried your memory-tweaked build, but now the PinecilV2 is in a reboot loop. Maybe related to https://github.com/Ralim/IronOS/issues/1680 ?

PS: Yes, seems like https://github.com/Ralim/IronOS/issues/1680 might be the issue. After plugging it into a wall socket USB, it does not reboot.

NalleBerg commented 1 year ago

Describe the bug After flashing by blisp to v2.21 my pinecil v2 looks completely dead.

To Reproduce

  1. Flash v2.21 (Pinecilv2_EN.bin)
  2. iron looks dead, no display on screen

Expected behavior normal work

Details of your device:

  • Device: Pinecil v2 bought on pine64 site, so should be original one.
  • Release: i don't know how to see. in flash mode it doesn't show anything on screen.
  • Power adapter being used: tried both Type-C and 5525

Additional context v2.21 it seems doesn't work on my Pinecil v2. Thankfully, flashing v2.20 brings iron back to life.

I had the same problem. I solved it by flashing it twice. First I set it in programming mode by holding the [-]-button while inserting the USB C cable, attached to my PC and holding it for 15 sec. Then the screen is black, but the PC recognises the Pinecil. After that I flashed it with the command: .\blisp.exe write -c bl70x --reset .\Pinecilv2_NB.bin I got a «Flash failed» message on the command line. Then, without removing the USB-cable, I flashed one more time and then it flashed v2.21 just fine and I am now using it with the new firmware just fine.

It might have been just a fluke, but it worked for me; maybe it'll work for you too?
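The "flash it again if the first attempt fails" workaround described above could be scripted roughly like this. It is a sketch only: `flash_with_retry` is a made-up helper, the command is the one quoted in the comment, and it assumes blisp exits with a non-zero status when it prints «Flash failed», which the thread does not confirm.

```python
import subprocess
from typing import Callable, Sequence


def flash_with_retry(cmd: Sequence[str], attempts: int = 2,
                     run: Callable[[list], int] = subprocess.call) -> int:
    """Run cmd until it exits with status 0, at most `attempts` times.

    Returns the 1-based attempt number that succeeded.
    Assumption: the flashing tool signals failure via a non-zero exit code.
    """
    for attempt in range(1, attempts + 1):
        if run(list(cmd)) == 0:
            return attempt
    raise RuntimeError(f"flashing failed after {attempts} attempts")


# Command as quoted in the comment above (Windows paths).
BLISP_CMD = [r".\blisp.exe", "write", "-c", "bl70x", "--reset",
             r".\Pinecilv2_NB.bin"]

# Usage with the Pinecil already in programming mode and still plugged in:
#   flash_with_retry(BLISP_CMD)
```

The `run` parameter exists so the retry logic can be exercised without hardware; by default it simply calls `subprocess.call`.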

aguilaair commented 1 year ago

@Ralim mind providing the .bin artifact? Unfortunately, the GH Action artifacts are not visible to the public :(

discip commented 1 year ago

@aguilaair Try this: Refactor PinecilV2 Tuning

rcmurphy commented 1 year ago

This certainly fixed it on my brand new Pinecil v2 (I thought I was just being an idiot until I checked the issues 😅 )

Spoke too soon; would not reboot

Flashed 2.20, then that update directly on top of it, and now its happy? 🤷‍♀️

River-Mochi commented 1 year ago

Flashed 2.20, then that update directly on top of it, and now its happy? 🤷‍♀️

@rcmurphy Good news you got 2.21 working with the beta fix for 2.21

  1. did you need to update to 2.20 first and then this 2.21 beta second to get it to work?
  2. did you use Blisp or Pineflash to do the update?
  3. does this beta 2.21 work for you with Bluetooth "ON"?
  4. for example bluetooth on and you are also able to use one of the pinecil bluetooth apps here (e.g., joric's webpage graph or builder555's PineSAM) ?

rcmurphy commented 1 year ago

Yes, it all works, but it seems pretty unstable- I've unplugged it and had to reflash to get it working again.

aguilaair commented 1 year ago

@aguilaair Try this: Refactor PinecilV2 Tuning

After testing it, trying 2.20, and flashing this, I still experience the same symptoms. I can get it to eventually boot by unplugging it and plugging it back in, in which case it works, and Bluetooth is working.

Flashed with pineflash on MacOS

Ralim commented 1 year ago

Any chance anyone who has crashing on newer builds also has a way to get the crash information from the UART?

vadimcreates commented 1 year ago

Hi, have the same issue with Pinecil V2 on fw 2.21. I had to turn off Bluetooth to make it working normally otherwise screen is black. @Ralim would provide more diag. info if you instruct me how to do it :)

Ralim commented 1 year ago

Diagnostics info comes out the uart connection, if you have a breakout that lets you view the uart that makes it easy. (uart is pinned out on the extra USB3 pins).

If you dont have the hardware handy for that dont panic. Other ways to help include trialling my attempts to figure out the issue.

On that, I've made this draft PR for increasing the stack size of the thread that starts BLE; if anyone could give it a spin and report back would be great. https://github.com/Ralim/IronOS/pull/1706

aguilaair commented 1 year ago

1706 has the same issues. Unfortunately, I do not have a breakout right now

vadimcreates commented 1 year ago

Diagnostics info comes out the uart connection, if you have a breakout that lets you view the uart that makes it easy. (uart is pinned out on the extra USB3 pins).

If you dont have the hardware handy for that dont panic. Other ways to help include trialling my attempts to figure out the issue.

On that, I've made this draft PR for increasing the stack size of the thread that starts BLE; if anyone could give it a spin and report back would be great. #1706

@Ralim I think I have the hardware but I don't know how to connect the pins. Should TX2+(B2) and TX2-(B3) from USB-C be connected with RX from USB to TTL adapter? Or the setup is different? I have putty installed that can easily connect to my USB-TTL adapter. If you can just give me a clue how to set it up, I will try to extract the logs for you.

PSandro commented 1 year ago

Hey, here's the output of my pinecil v2 (still) having the issue. I've flashed the Pinecilv2_EN.bin from the artifacts of dev build. SHA256 sum of the bin file is 296bd431a5273ae9752de10261373e6c444da29d76760e7fc0c68384693cf9e1.

I confirmed on the pinecil, that it is indeed version v2.21.CAA638C built on 14-06-23. Please let me know if I can be of any further help in fixing this :)

show flash cfg:
jedec id   0x000000
mid            0xC2
iomode         0x11
clk delay      0x01
clk invert     0x01
read reg cmd0  0x05
read reg cmd1  0x00
write reg cmd0 0x01
write reg cmd1 0x00
qe write len   0x02
cread support  0x00
cread code     0x00
burst wrap cmd 0x77
-------------------
Enable IRQ's
Pine64 Pinecilv2 Starting
BLE Starting
            Trap_Handler
mcause=30000007
mepc:23023dd6
mtval:00000000
Store/AMO access fault
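Posting a SHA256 sum next to the log, as done above, lets others confirm they are testing the same binary. A minimal way to compute and compare it is sketched below; the helper names are made up, and the file name in the usage comment is the one from this thread.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with a posted one, ignoring case and spaces."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage, with the sum quoted in the comment above:
#   matches("Pinecilv2_EN.bin",
#           "296bd431a5273ae9752de10261373e6c444da29d76760e7fc0c68384693cf9e1")
```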

Ralim commented 1 year ago

alrighty I pushed up a different solution that would be great for someone to test if they could

PSandro commented 1 year ago

Thanks for the update, I can still reproduce the bug. But now it seems to happen less often. sha256sum: 25d9058c6625103706aaf090d0ca2cc87c18521b1778ed5e4562a383b8e5553f Pinecilv2_EN.bin version: v2.21.4EF3584

output:

dynamic memory init success,heap size = 48 Kbyte
show flash cfg:
jedec id   0x000000
mid            0xC2
iomode         0x11
clk delay      0x01
clk invert     0x01
read reg cmd0  0x05
read reg cmd1  0x00
write reg cmd0 0x01
write reg cmd1 0x00
qe write len   0x02
cread support  0x00
cread code     0x00
burst wrap cmd 0x77
-------------------
Enable IRQ's
Pine64 Pinecilv2 Starting
BLE Starting
            Trap_Handler
mcause=30000007
mepc:23024074
mtval:00000000
Store/AMO access fault

sorgelig commented 1 year ago

I wonder why Pine64 cannot help with nailing the issue. They have many pinecils and definitely can find those not working after update and then test it thoughtfully or even send to Ralim. Didn't Ralim advertise them as most responsive and community oriented team? 3 months have passed and source of problem is still unclear.

Ralim commented 1 year ago

Thank you @PSandro

That is excellent to hear, I suspect it's still an out-of-RAM condition. I might upload some extra firmware for you to test. To answer a question I've been wondering about, any chance you could scan and see how many BLE devices are nearby to you? I'm wondering if this is either it doing a scan and OOM'ing or some random device querying something it can't handle.


@sorgelig

So there are a few things that need to be set straight before we go into this and before anyone starts yelling at Pine. (1) BLE is not a Pine-advertised feature, and in all design work it is not a core function of the device. This is the reason it's not on the store page (which is run by Pine) but mostly only on community pages. This decision, as far as I understand, was because the BL70x is a young chip on an unstable platform, and trust is not yet built up with Bouffalo that it's bug free (for reference, it definitely has bugs).

The people who should be in question for helping here are Bouffalo, but I'll get to them in a second.

(2) Thus all BLE is down to me deciding to build it because the community asked for it. It took a while to take the very poor mcu_sdk release at the time and cull it down enough that (1) I could ship it and (2) it sort of worked. It was always going to be a second-tier goal / priority, as it's not required for the operation of the device at its core, coupled with (1): it not being an advertised, fully supported feature.

(3) One can note that Bouffalo no longer supports BLE or even the radio at all in the bl70x in all of the newer revisions of the SDK. I'm running on an older SDK just so we can have BLE, even though it means it's a mess for support and will eventually become its own nightmare.

(4) Pine64 have been great for communication and support, and it's not like they don't know about this. But they can't really do anything here. It's down to Bouffalo at this point. There are lots of other aspects of Pinecil that they do help with often, but this is not one they have any control over. Bouffalo have decided on their current trajectory, and as of present it doesn't look to include much for the BL70x. The vibe is that they are focused on other devices at this point in time.

(5) Additionally, it's been quite hard to get reliable reproductions of it, as some tests factory-side didn't show the issue, nor have any of the devices I have here or those of other Pine staff. And yes, these are across multiple batches of devices. And I've already confirmed with Pine that it's the same BL702 part number being fitted, etc.


So at the moment given the above, its a bit of a goose chase to debug what is OOM'ing and not handling itself correctly.

I have narrowed it down to likely areas. If anyone in this thread is comfortable editing the code and re-flashing a bunch to try and narrow down the code I'd love to pair up with you to gain more insight.. Otherwise I'll continue my slow binary search of potential causes. I've found ways to definitely cause this to occur but they still dont 1:1 match up so its slow.

Also keep in mind that yes "3 months have passed and source of problem is still unclear", but also in those 3 months I have not been around much to actually do anything. As this is a project of "when time is available", there are long stretches where I do not open it at all. I've only really spent a few hours on this, fragmented over all of that time.

sorgelig commented 1 year ago

Since BLE is an unavoidable part of newer firmware, I suggest turning BLE off by default. Moreover, it would be wise to clear this flag with EVERY flash of firmware. In that case, if some Pinecil has this problem, then simply reflashing will fix it. Those who have no such issue and would like to use BLE can simply re-enable it after flashing. That way, the firmware can move forward fixing other bugs while this issue can live for some time without being a distraction.

gamelaster commented 1 year ago

Since it is hard to remotely figure out why the crash happens, I can offer someone a fully working Pinecil V2 in exchange for a Pinecil V2 which crashes on v2.21. Shipment would go from/to Slovakia, and I will cover the shipment fees. If you are interested, let me know by email at gamiee (at) pine64.org. Thanks!

aguilaair commented 1 year ago

I'd happily ship mine, but it has a transparent case. Do you have one that also has it, @gamelaster?

gamelaster commented 1 year ago

@aguilaair yes, eventually I can swap it. Although I see you are from USA, so that would introduce customs hassle and issues :frowning_face: as it would ship to Europe, so it might not be good idea. Although, thank you for offer!

aguilaair commented 1 year ago

I'm currently in Spain! I study in the USA but am back in Spain for the whole summer :D

gamelaster commented 1 year ago

@aguilaair ohhh, great! Can you please contact me on Discord (gamelaster), Telegram: ( gamiee ) or on email gamiee (at) pine64.org ? Thanks!

aguilaair commented 1 year ago

Sent, @gamelaster! (email)

gamelaster commented 1 year ago

Probably closed by accident. I received affected Pinecil V2 from @aguilaair (once again big thanks!), and I will try to look on it ASAP. Hopefully we can fix it soon.

rcmurphy commented 1 year ago

Let me know if you need another! I'm in the US.

Best, Rebecca

Ralim commented 1 year ago

Hello 🙇🏼

Would like to update here that I've posted a PR which has some new build assets to test at #1756. Would be amazing if we could get some more testing by anyone affected or not affected. The hope is that this (1) resolves the crash, and (2) gets BLE working. I'll take (1) but realllllllly hoping for (2).