fpv-wtf / wtfos-configurator

Configurator for wtfos, with built in margerine
GNU Affero General Public License v3.0

[LT150] dinitctl: connecting to socket /data/.dinitctl: No such file or directory #244

Closed robustini closed 1 year ago

robustini commented 1 year ago

As I have already documented on Discord, this Air Unit Lite, once rooted, consistently fails to run correctly installed packages at boot because of this dinit issue. However, after running the following command in the CLI, the two selected services run smoothly, as you can see in the image. Without that command I cannot activate them, so I lose them at every reboot:

/opt/etc/init.d/rc.unslung restart

image

@j005u has already run a remote session to investigate but could not work out where the problem lies. From what I read on Discord, @stylesuxx helped someone with this same problem, also on an Air Unit Lite; I don't know whether that succeeded. We have already reflashed slot_1 to the stock version, after checking the md5 of ota.zip, with these commands, which gave no errors:

unrd slot_1.status_successful 0
unrd slot_1.status_active 0
unrd slot_2.status_active 1
rm -rf /blackbox/wtfos
reboot

But even after this operation, followed by the wtfos repair/fix, the problem was not solved. In my ignorance: if an init.d restart after booting solves the problem, one solution might be to check whether the selected services actually started during boot (by checking the related PID or something similar) and, if they did not, run an rc.unslung restart. I think that would be the simplest solution. Is it possible that the services are launched at boot at a moment when some condition prevents them from running? I don't know. If you cannot work out where the problem actually lies, I think it is enough to find an empirical way around it.

I tried firmware 606 and 608, same problem. I also reflashed with the Assistant, but I remain at a dead end. Without these services available after a reboot, root on this unit is useless. I have rooted and set up three other identical units with the configurator without any problems; there the selected services start at boot as expected. I'm in your hands........ :-(
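A minimal sketch of the check-and-restart workaround suggested above, assuming the missing control socket `/data/.dinitctl` (from the issue title) is a usable signal that the restart is needed. The function names and the overridable `DINIT_SOCKET` and `RC_SCRIPT` variables are hypothetical; only the two paths come from this thread.

```shell
#!/bin/sh
# Hypothetical boot-time check: restart the init scripts when dinit's
# control socket is missing. Both paths are taken from the thread; the
# function names and overridable variables are assumptions.

DINIT_SOCKET="${DINIT_SOCKET:-/data/.dinitctl}"
RC_SCRIPT="${RC_SCRIPT:-/opt/etc/init.d/rc.unslung}"

# Succeeds only when the path exists and is a socket.
dinit_socket_present() {
    [ -S "$DINIT_SOCKET" ]
}

# Run "rc.unslung restart" once if the socket is missing.
ensure_dinit() {
    if dinit_socket_present; then
        echo "dinit socket present, nothing to do"
        return 0
    fi
    echo "dinit socket missing, running $RC_SCRIPT restart"
    "$RC_SCRIPT" restart
}
```

Calling `ensure_dinit` from some late boot hook would automate the manual command. It only works around the symptom and does not explain why the socket is missing.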

robustini commented 1 year ago

Interesting: don't ask me why, but apparently all the services do start, and I have msp-osd working in the Goggles!

image

It's the configurator that doesn't detect them after a reboot because of that initd issue, so the services cannot be enabled or disabled in the startup section. I can do that on that page only after an rc.unslung restart in the CLI! Tested on two identical Vista Lite units, one with firmware 606 and the other with 608: same issue. So, after a reboot:

root@pigeon_wm150_tiny:/ # dinitctl list
dinitctl: connecting to socket /data/.dinitctl: No such file or directory

And after a "/opt/etc/init.d/rc.unslung restart":

root@pigeon_wm150_tiny:/ # dinitctl list
[[+] ] boot
[{+} ] wtfos-auto-naco
[{+} ] msp-osd-airside (pid: 2736)

So this initd does its job at boot, but then something abnormal happens and it seems to die. I hope this report helps you track down the problem, although for the moment the important thing is that the services work at boot.

robustini commented 1 year ago

I also checked the third Caddx Vista Lite that I own, and the funny thing is that there everything works as it should: initd has no problem at boot and the services are configurable on the startup page. So two Vistas have this problem and one does not, which is super weird, since I rooted and installed wtfos in exactly the same way. But there is one substantial difference: on the unit where initd gives no problems after boot I never updated the firmware (in fact /cache/ota.zip is missing); it came with 606. On the other two, where I did update the firmware, initd does that weird thing.

stylesuxx commented 1 year ago

Just to recap, we checked env and mount state before running dinit in the init.d startup script - everything is looking fine so far: /data is mounted and $HOME is set properly.

dinit and the actual msp-osd are also running - that's why the OSD is working. Obviously the socket is not being created for some reason.

One thing we can do now is pull the system partition via ADB from one working Vista and one broken one and diff them. After that we can try transplanting parts of the system from the working one to the non-working one and see what happens.
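The pull-and-diff step could be sketched roughly as follows, run on the host computer. This assumes `adb` is on the PATH, that `/system` is readable over ADB on these units, and that a `diff` supporting `-r` and `-q` is available; the function names and directory names are hypothetical.

```shell
#!/bin/sh
# Hypothetical host-side helpers for comparing two system dumps.

# pull_system SERIAL DEST
# Copy /system from the device with the given adb serial into DEST.
pull_system() {
    adb -s "$1" pull /system "$2"
}

# compare_dumps GOOD_DIR BAD_DIR
# Recursively compare both trees and print only the files that differ
# or exist on one side only. Exits non-zero when differences are found.
compare_dumps() {
    diff -rq "$1" "$2"
}

# Example (serials are placeholders):
#   pull_system <working-serial> ./system-good
#   pull_system <broken-serial>  ./system-bad
#   compare_dumps ./system-good ./system-bad
```

Listing only the differing paths first keeps the output short; individual files can then be inspected one at a time.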

robustini commented 1 year ago

I dumped two units running 606, one working and the other with the problematic initd; you have a PM on Discord.

j005u commented 1 year ago

Diffed, only thing of note is that the broken system has wtfos-remove-adb populated from a previous uninstall/re-install cycle, but I don't see how that could cause this issue.

robustini commented 1 year ago

@j005u @stylesuxx if it helps, here is what I did: I removed wtfos and root from the working unit running 606 and upgraded it to 608. I immediately noticed that, after the upgrade and a power cycle, the Assistant correctly detects version "1.00.0608" as installed. On the other two "broken" units with the initd problem, the Assistant shows version "00.00.0000" after the upgrade, and the configurator does not get past attempt 1 on its own during rooting; I have to power-cycle the Caddx Vista to continue with attempt 2. When rooting the good unit, by contrast, attempt 1 resets the unit by itself, and once it reconnects the firmware version is detected correctly.

00:21 - Found Device: LT150, Version: 01.00.0608

In the problem units, however:

00:09 - Found Device: LT150, Version: Unknown

And it is on these two that initd then shows that problem. It can't be a fluke. On the good unit that I upgraded to 608, the services of course start automatically without any problem. Another interesting thing: on the good unit, after the umpteenth root, I got no report of a corrupt wtfos needing repair, whereas on the two failing units it appears every time. In short, the failing units show strange but identical behavior.

robustini commented 1 year ago

So in my opinion, if I can make sure that the firmware version is detected correctly during rooting after updating with the Assistant, initd will work as it should. Now the problem is how to make that "00.00.0000" go away, so that the correct version is reported after the update. Is that possible or not?

stylesuxx commented 1 year ago

Interesting. I have seen people not being able to root with 00.00.0000 and the fix is to refresh the 0606 firmware via DJI assistant. So maybe this is something that you could try...

robustini commented 1 year ago

> Interesting. I have seen people not being able to root with 00.00.0000 and the fix is to refresh the 0606 firmware via DJI assistant. So maybe this is something that you could try...

So you mean downgrading from 608 to 606 on the units that show 00.00.0000? I think I already tried that and it stayed at "00.00.0000", triple damn. Where is this string saved? Somewhere in /data? The trick for rooting with 00.00.0000 is what I wrote above: turn on the unit and, right after clicking "ROOT", turn it off and on again. If a few seconds pass it will not work and the configurator shows a blank page.

robustini commented 1 year ago

If the system and wtfos turn out to be identical across the units, in my opinion we should now focus on /data; perhaps that is where something important differs. My two cents...

stylesuxx commented 1 year ago

The version string is saved in an XML - the DJI assistant will also show you the 0000 version. This definitely should not be this way. Yes, some can root with this broken state, some can't, some can after doing weird things. The correct thing would be to downgrade to 0606 and make sure that this is also the version being displayed (during root and the DJI assistant - they both use the same location to read this version number).

It's probably not an issue with /data per se. It is an issue with the socket not being created there for some reason, or rather it being created and then deleted somewhere in between.
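One way to test the "created, then deleted" theory would be to poll the socket path during boot and log every change. This is only a sketch: the function names, the interval and the sample count are assumptions, and the path to watch would be `/data/.dinitctl` from the error message.

```shell
#!/bin/sh
# Hypothetical diagnostic: sample a path repeatedly and report each time
# it appears or disappears.

# Print "present" if the path exists (any file type), "missing" otherwise.
socket_state() {
    if [ -e "$1" ]; then echo present; else echo missing; fi
}

# watch_socket PATH INTERVAL_SECONDS SAMPLES
# Prints one line for the initial state and one per later state change.
watch_socket() {
    path="$1"
    interval="$2"
    samples="$3"
    last=""
    i=0
    while [ "$i" -lt "$samples" ]; do
        now=$(socket_state "$path")
        if [ "$now" != "$last" ]; then
            echo "sample $i: $path is $now"
            last="$now"
        fi
        i=$((i + 1))
        sleep "$interval"
    done
}
```

Started early in boot as, for example, `watch_socket /data/.dinitctl 1 120`, a "present" line followed by a "missing" line would point to deletion rather than the socket never being created.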

Is your working device showing the correct version number in the assistant?

robustini commented 1 year ago

After flashing with the Assistant and before rooting, yes, and after a power cycle it still shows the same version. The other two, however, go to 00.00.0000 after the first power cycle following the flash, and then show the message "Found Device: LT150, Version: Unknown" during rooting. It cannot be a coincidence that those two units then show the initd problem and the other does not. It would be interesting to know whether those who managed to root with "00.00.0000" also have that socket working normally.

robustini commented 1 year ago

Unfortunately, the "00.00.0000" problem is not solved by downgrading from 608 to 606 or by a firmware refresh; I just tried both after removing wtfos and root. The incorrect version evidently appeared after rooting; originally the version was shown correctly. This is the sequence:

Before downgrading: image

Successful downgrade: image

Without restarting the Caddx, the Assistant shows the correct version: image

After restarting the Caddx the error comes back, which is strange given that the correct version was shown before the reboot: image

By contrast, if I remove root from the unit where initd works fine, it always shows the correct installed version after any upgrade, refresh or downgrade, even after a reboot. If we cannot figure out how to get those units out of this deadlock, I don't think the initd problem can be solved easily.

robustini commented 1 year ago

Browsing through the chat on Discord and searching for "00.00.0000" I see that others have had the same problem.

stylesuxx commented 1 year ago

Can you please use the latest version of DJI Assistant 2 (DJI FPV series)?

robustini commented 1 year ago

> Can you please use the latest version of DJI Assistant 2 (DJI FPV series)?

Tried it now, same result; that 00.00.0000 won't go away even if I shoot it, damn.

robustini commented 1 year ago

It should be understood where it is and why it stays. Could a /data dump of the working unit and this fucked one with the same firmware version be useful to figure out how we can solve it?

stylesuxx commented 1 year ago

> It should be understood where it is and why it stays.

Yeah, you know - it's not that we have source and documentation to look into as reference ;-)

> Could a /data dump of the working unit and this fucked one with the same firmware version be useful to figure out how we can solve it?

Highly doubt it. I mean sure, go ahead, dump it and run a diff on it. I don't suspect the problem to be there. I guess there is corruption in some place. It would be interesting to find out when the firmware gets set to the 00.00.0000 state...

j005u commented 1 year ago

Using butter to reset to 0608 and then further re-flashing through the assistant solved the problem.

Closing.