vedderb / vesc_pkg

Official VESC Packages
GNU General Public License v3.0
33 stars 42 forks source link

New package: Refloat #40

Closed lukash closed 3 months ago

lukash commented 6 months ago

My new package, Refloat, is in beta now. I'm posting this PR in advance to discuss the details of merging it.

I'm decidedly against squashing the commits into one. I know in a repo of multiple packages this creates a mess, but FWIW the development of "native" packages also happens directly in this repo.

I care about git history and since this repo is the de-facto upstream for all other packages (meaning they need to be on top of this repo's history), they would not get the full history of other packages if it's not in here. What I can at least offer is a decent quality history with very little WIP back and forth, the history is very clean and the commits well structured.

Going forward I really believe this repo should be split up and some mechanism used to pull separate package repos and use refs to mark versions to release. An issue is the shared c_libs code, which would need to go into a separate repo that each package would include in some way. More work, but I believe at this stage it's becoming warranted.

Another thing is, I'm using lefthook for git precommit hooks (I'm not thrilled with it but I want something to run clang-format), it's got a config file in the repo root. A separate repo would resolve this. Same goes for github CI, which I'll want to hook up very soon, and that also has a config directory in the root...

vedderb commented 6 months ago

That looks like a lot of work, well done!

I'm ok with having all commits in this repository. Even if there are a lot of commits with different quality you can still get a detailed history for individual files without too much noise. Another way would be to use git subtrees. Then your repository has the main history and this one has a copy that is synchronized when needed, similar to how LispBM is included. I prefer subtrees over submodules as I want this repository to be independent of having access to working links to other repositories when cloning it. Then when you want to make a "release" you make a pull request that synchronizes your subtree in this repository with yours, optionally squashing all commits depending on what you think is best.

Regarding github CI, I don't want to depend on github more than necessary for various reasons I won't go into now, so I will not set that up in this repository at least.

lukash commented 6 months ago

Thanks!

Yeah I saw the git subtrees you're using for LispBM and such. Haven't worked with them, but I'm not a big fan of submodules, I'd need to test subtrees a bit but I'd say at least for syncing the c_libs into separate package repos they would seem suitable. For pulling packages to release them, you could really just have a list of repo URLs each with a particular git ref, clone that and build the package... no need to keep them synchronized in a single repo I'd say. But it's also a possibility.

What about the .lefthook.yml in the root? Do you mind? It's limited to the refloat/ dir and only does anything when you have the lefthook git hooks installed.

About the CI, I'll really want it for my repo once any contributions could come along and it seems it might be sooner than I expected. I'll need it to ensure clang-format is applied in an automatic way. The "dependence" on github CI is really "superficial" in the sense it's only some ~20 lines of yaml to configure. It's using infrastructure given us for free. If you'd be to move elsewhere with hosting your repo, it's not that hard to replace the CI with another one.

That said, again, would you mind if I (somehow, I think it's possible) limited the CI to only run on my fork?

vedderb commented 6 months ago

I don't mind lefthook if the consequence of having it there is just the fact that the file exists. Same for the CI or files for other tools. The only type of files I want to avoid having in the repository are large files that make it grow too big (e.g. videos, toolchain binaries etc.). Cloning and dealing with a several GB repository has so many downsides.

The best way to deal with contributions from other developers is probably to do that in your repository, then you don't depend on me for reviewing all the details. You just make a PR to this repo when you are ready to make a release and get the package into VESC Tool.

Regarding subtree, I think you can do the following: Your repository only contains refloat. There you do all the development and build the package manually for testing using the makefile. For finding c_libs your makefile could check a few different paths in case you don't want to place them in the same place relative to my repository. Alternatively, you can also have a complete copy of c_libs as they only are a few kb. Then to make a release, you have a copy of vesc_pkg, update the subtree in it and then make a PR to my repository with that update.

lukash commented 6 months ago

I've added CI to my repo (and subsequently to the PR, is it's from the main branch). I limited the job to my repo via a condition, but the result of the condition on other repos should be the GitHub Actions job is still displayed and marked as skipped. I expected this to happen in this PR, and was about to suggest you disable GitHub Actions in the repo settings. Did you perhaps do this pre-emptively?

If you have GitHub actions disabled, I think I'll remove the condition from the CI, so that it's possible for others to run it on their forks.

FWIW1: I think it'd be really good if you had the CI on your repos. It's an added value, if you'd be moving off GH or anything, you'd lose it, and be back to the state you were before. Presumably, it's possible to replace the CI on other platform...

FWIW2: I managed to get the full package build running in the CI. Honestly it felt I'm crossing the boundaries into the arcane with the technologies I had to employ :joy:. It uses NixOS infrastructure to build the vesc_tool and some GitHub-provided cache for the build results... I'm a NixOS user, otherwise I'd never be able to piece this together. Disadvantage is, the whole setup will be really inaccessible to common users, NixOS is niche and has a steep learning curve. None of this would be needed, if vesc_tool was readily available in Ubuntu...

lukash commented 6 months ago

I don't mind lefthook if the consequence of having it there is just the fact that the file exists. Same for the CI or files for other tools.

:+1:

The only type of files I want to avoid having in the repository are large files that make it grow too big (e.g. videos, toolchain binaries etc.). Cloning and dealing with a several GB repository has so many downsides.

Agreed and I'm sure I'm even stricter in what I'm wanting to commit to git. No auto-generated files should be in there either :grin:.

The best way to deal with contributions from other developers is probably to do that in your repository, then you don't depend on me for reviewing all the details. You just make a PR to this repo when you are ready to make a release and get the package into VESC Tool.

That's definitely the plan.

Regarding subtree, I think you can do the following: Your repository only contains refloat. There you do all the development and build the package manually for testing using the makefile. For finding c_libs your makefile could check a few different paths in case you don't want to place them in the same place relative to my repository. Alternatively, you can also have a complete copy of c_libs as they only are a few kb.

I think having a subtree of the c_libs in a repo that would contain a single package would be a pretty good way to have the shared code in a package repository that's otherwise independent.

Then to make a release, you have a copy of vesc_pkg, update the subtree in it and then make a PR to my repository with that update.

This is what I don't like, it's an unnecessary added complexity with having a subtree of a package repo in your "all-packages" repo. (note I still need to try out how subrepos work, but...) There's no need to "copy" the codebase into your repo, you can just have a very simple reference in any suitable form. Simplest thing I can think of would be a text file with one repo and git ref on a line, e.g.:

github:lukash/refloat/v1.0.0

(The format used is the one that NixOS uses for git refs, it can be anything that works for you, the URL I guess would be: https://github.com/lukash/refloat/tree/v1.0.0, or something similar).

Then you'd "basically" just add a git clone command to your script that's building the packages and puts them into the VESC Tool package repository. There's a number of details that can be tweaked but the whole process is not really complicated.

Then, for me to submit a new version of the package, I'd just submit a PR with an update to the git ref to use for the new version.

The concept is basically the same as having a git submodule, subtree or whatever else, it just differs in some ways and the goal is to make it as simple as possible and automate the parts that can be automated.

vedderb commented 6 months ago

Just to be clear, the thing I want to avoid is having this repository depending on some links working. That is the main reason I prefer subtree over submodule. The repository itself, when checked out from the main source, should contain everything needed to build all the packages. If this leads to an excessively large size there might be reason to do it differently, but I don't think that will be a problem here.

lukash commented 6 months ago

Why do you need that? It's effectively a duplication, an unnecessary copy. Well designed systems (in my opinion and in my experience, the ones that worked the best were built this way) avoid these copies to reduce complexity and ease of maintenance. Unless there's a good reason, which I don't see here.

To explain the difference to making the copy of c_libs (e.g. using git subtree) in a package repo, which could seem a similar case: You need a copy of c_libs to build the contents, and most users of the repository will want to build it. It's a great simplification for all users if the data is there directly, instead of having to pull the dependency via some tool.

In contrast, the "repo-with-all-packages" is mostly just for you to pull everything together for building to the package registry (I'll call it that to disambiguate with "repository"). The amount of duplication is much bigger and it add a non-trivial burden on package authors to sync the subtree for a release.

To be clear, when I talk about "repo-with-all-packages", I'm talking about some theoretical evolution, not necessarily about the vesc_pkg repo the way it is now. You're probably suggesting for me to have my package as a subtree and want to carry on with the rest of the packages the way they are here now. I think this is going to be detrimental long-term, as everything is on one big pile with a multitude of approaches of pushing the updates.

I've already had a couple of people contact me and express their intent to adopt the Refloat codebase and my intent is to provide a quality base package that people could fork and experiment with, having good starting structure and everything easy to tweak or replace. If I make my package a standalone repo and push it here through a subtree, if they want to publish their own package at some point, they will need to do the same. Not everyone is good with git, and this is going to be an obstacle and a manual process in which they could make errors. And I just fail to see the point of why it'd be needed.

This change will require a lot of shuffling around of code, the later it is done (and I believe it's just a matter of time), the more people it will affect and the more painful it will be. It'd be great to do it now. I can help out if you want.

vedderb commented 5 months ago

Wow, we are on completely different pages here... Maybe there is a miscommunication or something that is obvious to me is not at all clear to you. If my repository depends on tens or hundreds of different links my build will fail every time a link breaks. A link can break if you rename your repository, delete your github (or other) account, change your mind about contributing to my repository etc. I have experienced this so many times with linux-scripts that download things here and there to build something - very often this or that link is broken and someone has to constantly track and fix them. I absolutely want to avoid that situation unless there is a good reason not to do that. The only reason I can think of is repository size, but I really can't see how that can become a problem unless the wrong type of files get checked in.

A "theoretical evolution" is not a good reason to me either as code takes so little space compared to the effort it takes to write. If that theoretical evolution happens, it means that such vast amounts of contributions are happening that a lot of things need re-thinking anyway. And before you make the argument "why not make it work for that evolution just in case", the answer is 1) re-read what I just wrote here and 2) it is very hard to predict what the best way is to do something, and things need to gradually evolve to some extent and 3) if that revolution happens and there somehow is an explosion of contributions, chasing links becomes even more of a nightmare.

When it comes to ease of contribution, I don't want to limit this to git-experts. A subtree is one way, but you can also just make a PR that adds your files. A subtree is just a way that makes it easy to copy in another repository and see exactly at which commit the copy was made. As long as all source-files needed to build the packages end up in this repository without having to fetch them from links here and there I'm happy. It is not a multitude of ways to contribute either, it is just a pull-request that adds files to the repository. Those added files can be just the files you want to add/update or it can be a subtree-sync which also just updates a bunch if files in the end and can be reviewed in the same way.

One more point: If you want to make a contribution you must be able to get and build this repository anyway. You also need to have VESC Tool on your computer as well as git and make. Therefor it is important that this repository is easy to build if you have those tools. I want you to test your pull request and make sure that it builds before you contribute, otherwise I have to do that for you. So for making contributions to this repository, you are in a situation where you need to download and build it first anyway; and do you know something that is more troublesome to deal with than a repository full of things? Answer: a repository that has to be filled with the very same things from a bunch of potentially broken links.

lukash commented 5 months ago

Ok, now I understand where you're coming from, but I'm afraid we are still on different pages. I guess we both have a fundamentally different point of view on this.

IMO it's not you who is the maintainer of a (random) VESC package. It has its own maintainer. That maintainer is responsible for the package being at the link they gave you, for it building successfully and being functional.

You just maintain the package registry (again, calling it that to avoid the ambiguity of the word repository with a get repository). If a package fails to pull or build, you skip it, build the rest of the packages and you raise an error in the build script so that you get a notification. You ping the maintainer, if the package is dead, it's dead. If it's important to you, you may take over maintaining it. Otherwise, it's not really your job to fix the package. You should still have the older builds of the package so that people can still download and use those.

This should actually result in less work for you long term, not more. With having the packages in your repo, you kind of take the responsibility of updating them to newer APIs, toolchains or whatever on yourself. If they're out of tree, it's not your problem (except for making sure there's an upgrade path for them from your side).

It's a common practice that is being used and works for whole huge ecosystems of packages (to just pull from git URLs and compose those), it's working well and there are multiple reasons why it works best this way, I won't go into lengthy explanations anymore.

... A subtree is one way, but you can also just make a PR that adds your files. ...

Yes, but it adds a whole host of issues with long-term maintenance as all the packages are in one repo, and it instigates more bad practice when people not skilled with git try to cope with merging the changes coming from different directions. Then it becomes overly complicated and doesn't scale at all.

lukash commented 5 months ago

But let's leave it, it's not something I'd ask you to just start changing to right now anyway. I just wanted to agree on some sort of a direction, but anyway, I decided to try and do it the way you suggested: Make my repo separate and integrate with vesc_pkg via subtree.

We can see how it works, and having my repo separate is the more important thing anyway.

What I'd like to do is have c_libs as a separate repo as well, which I will include as subtree in Refloat. I'll create that repo, if you agree with this approach, I'll make it work in Refloat, and then either you can take the repo over or I can keep maintaining it.

Refloat will just build with that internal c_libs copy, I wouldn't like to switch to the one in vesc_pkg.

What do you think?

vedderb commented 5 months ago

It is incredible to me how you dance around to insist in being right. I'm making a VERY simple point here: if my build depends on links and they break, I have to fix that for my build to work again and I have to fix possible dependencies. If everything is in my repository that simply cannot happen. Your arguments about long-term effects are very vague and look like an attempt to prove your point rather than trying to be helpful.

You make another point: maintenance of packages and updating them. This is a responsibility I take on myself as I think it is important. I don't want things to break as it is generally frustrating to work against a system where things are not compatible between versions. I try to take care to not make old things incompatible with the latest developments as I think that is worth doing. There might be changes where there is no choice and that might give me some extra work with existing packages (something I CAN fix btw if they are in my repository and not if they are elsewhere), but I can't see that being a major problem. If packages get abandoned by users and/or developers I can still decide if and when to remove them as they are entirely in my repository and don't suddenly disappear because a link breaks.

Anyway, this is the situation:

  1. This repository will not be split into links, everything needed to build the packages will remain in the repository.
  2. If you want to contribute your package you can pick a way that works for you, as long as that way ends up in a pull request with files needed to build your package. You can work directly against the repository and add all your commits to the history, you can make a subtree or you can just add all files at once. All of that works for me and I think those options give everyone a good opportunity to contribute regardless of their skills with git and build systems.
  3. Regarding c-libs, if you want them in a separate place for your repository that is up to you, but I will keep a copy of them in this repository. When you make a pull request your package also needs to build without downloading or referencing them elsewhere. If you want to keep your own copy of them in your package directory that is fine with me.
lukash commented 5 months ago

It is incredible to me how you dance around to insist in being right.

Excuse me, I don't dance around, I'm trying to make a simple point just like you are. And in that way, I'm trying to be helpful. At least that's my intention. We disagree, that's ok, although obviously not ideal. I'm sorry it's frustrating you, I only wanted to understand your reasoning and present the arguments against that specifically. I didn't convince you and that's the conclusion of the discussion. I find this to be a better outcome than not having the discussion and not knowing what are the other person's reasons for doing things the way they are.

I'm still ultimately weighing my options because none you give me are really good on my end of things. Not sure you understand that... :confused:

vedderb commented 5 months ago

I don't want to come across as unreasonable, so here is one example of something you wrote that bothers me:

image

This does not tell me anything at all. How does it instigate bad practice? How does it become overly complicated? Why does it not scale?

I have no idea what I'm supposed to to with this. To me it looks like random unbacked claims somehow coming from you wanting to do it in a different way. You have addressed none of my points with anything practical and you keep coming with these confident-sounding claims. That is why I am getting frustrated.

Here is one example from the discord:

image

Sounds like you are just claiming that you work in some "professional industry", therefore you must be right.

So many of your claims come across like this and arguing with them is just frustrating to me as they are so vague.

vedderb commented 5 months ago

On a final note, I'm happy to merge your package and get it into the next build. I think you have done a good job with the development and I would be happy to enable other people to use it. I hope we can find a way to avoid these endless conversations in the future and find a more constructive way to get to a conclusion.

lukash commented 5 months ago

This does not tell me anything at all. How does it instigate bad practice? How does it become overly complicated? Why does it not scale?

Just found this here, you're right, I didn't really explain this. I somehow haven't realized it and I (perhaps mistakenly) consider some problems of sharing one repo for multiple projects obvious. I thought I'd fill them in, but I don't want to bring back the argument, so I'll leave it...

When I said I'm weighing my options, I meant I'm just deciding between the two possible paths - leave the PR as it is, or migrating to a separate repo. Separate repo seems the more correct way forward to me, but it's gonna be a bit messy (I now have it in a fork of vedderb/vesc_pkg, but I'll need the fork for your repo, github doesn't allow two forks. So I'd need to move the package codebase into a new repo...) and there's more extra work to it.

surfdado commented 5 months ago

One last thing before we make this package official - could we please investigate why we run into trouble updating this package? I for example am completely stuck on beta3 of the Refloat package. I cannot erase the package, I cannot install a new version. So now my next course of action will be a firmware reinstall unfortunately...

Benjamin, could you try installing this package yourself and then see if you can erase that package (after a reboot)?

surfdado commented 5 months ago

I should note that I was able to install beta3 after running beta2, but it took several attempts with lots of erase timeouts and eventually it succeeded. But this time I have already surpassed the number of attempts but the old package just doesn't want to go away

vedderb commented 5 months ago

I should note that I was able to install beta3 after running beta2, but it took several attempts with lots of erase timeouts and eventually it succeeded. But this time I have already surpassed the number of attempts but the old package just doesn't want to go away

Is that behavior unique to this package? It could be caused by the C code not stopping the threads it creates when asked to, that it leaks memory or something similar. It is best to have access to an SWD-programmer while developing C libraries as there are so many ways for the C code to "brick" the firmware.

vedderb commented 5 months ago

When I said I'm weighing my options, I meant I'm just deciding between the two possible paths - leave the PR as it is, or migrating to a separate repo. Separate repo seems the more correct way forward to me, but it's gonna be a bit messy (I now have it in a fork of vedderb/vesc_pkg, but I'll need the fork for your repo, github doesn't allow two forks. So I'd need to move the package codebase into a new repo...) and there's more extra work to it.

Here is one way that I think should look like a separate repo on your end and a PR that adds independent files on my end:

  1. You have a separate repository, including the C-Libs, where you develop refloat, manage contributions from others etc. Your repository can build a vesc-package by itself that can be uploaded manually from vesc tool when you want to test it.
  2. When you want to make a release, you have a fork of vesc_pkg, update your package as a subtree and make a PR to me. Then I can review the changes as usual and merge them if they look ok. Then I have a copy of your files taken at that point in time.
  3. You go on and do development in your repository as usual. When you want to make a release again you update the subtree in the vesc_pkg fork you have and make a pull request to me with the latest version. I review it, merge it and it is done.

If it turns out that this is a bad method on my side it does not affect your repository and I can find a way to merge packages in a different way. Although I don't see any problems with this; you develop your package the way you see fit in your own repository and when you want to get it into the package store I get a pull request with all files that I can review in my way and keep in my repository independently of what happens to your repository until you make the next pull request.

Regarding having a copy of the C libs, I don't think that is such a bad idea. They are quite small and you could even e.g. update the ST libraries to another version for your build of you need that for something. All they need to do is generate and link a binary that is compatible with VESC_IF.

vedderb commented 5 months ago

Another note: After a quick read it seems you can make a fork of a fork. So another workflow would be that you have a fork of vesc_pkg that you develop in. Other can fork your fork and make pull requests to you. Although I think they can only have a fork of your fork if they don't already have a fork of my version in their account.

vedderb commented 5 months ago

Just tested, I can make a fork of your VESC Tool fork, although I can't put it directly in my user as I already have VESC Tool there. I can make it under any of the organizations I'm a member of though.

So you have three options that allow you to manage the refloat development independently: Separate repository with a subtree, separate repository where you just copy and merge files manually or working directly in a vesc_pkg fork where others fork your fork.

lukash commented 5 months ago

I for example am completely stuck on beta3 of the Refloat package. I cannot erase the package, I cannot install a new version.

@surfdado really? That is very serious. When I had these issues it seemed to me it was always after some time of repeated installation of the package. What helped for me was restarting desktop app, mobile app (used as a bridge) and board. One time this didn't help, but I restarted my phone and it helped (??? :shrug: yes, very weird, I blamed the bluetooth).

There was a user on pev.dev reporting something very similar on Spintend 100, but ultimately we found out he can't reflash the firmware, the erase failed on that as well. So I concluded the hardware went toast :slightly_frowning_face: I'm hoping this is not some serious bug that will be bricking people's boards? @vedderb not sure you can make a quick assessment whether we should announce people stop installing the package immediately?

It could be caused by the C code not stopping the threads it creates when asked to, that it leaks memory or something similar.

I think/hope the threads are stopped correctly, but it's possible to reboot via watchdog to rule out this possibility, that won't be starting the threads then. @surfdado could you try that? Reboot via watchdog using the terminal, then try reinstalling the package.

(I don't have more time now, I'll address the rest in the evening)

vedderb commented 5 months ago

Rebooting via watchdog should do the trick if it is the c-code getting stuck somehow. I will try it myself later if I have time.

surfdado commented 5 months ago

Rebooting my Linux PC (no TCP bridge) made no difference. I think this board might be locked to 6.2 with Refloat beta3 forever haha... I'm hesitant to try rebootwdt because my firmware has PIN-Lock functionality and I'm not sure whether FloatControl will let me unlock the board without any package running. Does power cycling after rebootwdt bring everything back to normal? I assume yes but I really don't want to brick this board (it has no USB port and SWD pin access requires a multi-hour job)

surfdado commented 5 months ago

I'm going to test this a bit more on my desktop VESC (ubox75 plus Trampa VESC express) to not risk my other boards. If we are all struggling with this it could be a nightmare for the average user

vedderb commented 5 months ago

Rebooting my Linux PC (no TCP bridge) made no difference. I think this board might be locked to 6.2 with Refloat beta3 forever haha... I'm hesitant to try rebootwdt because my firmware has PIN-Lock functionality and I'm not sure whether FloatControl will let me unlock the board without any package running. Does power cycling after rebootwdt bring everything back to normal? I assume yes but I really don't want to brick this board (it has no USB port and SWD pin access requires a multi-hour job)

Rebooting with the watchdog will prevent the package from starting at the next boot. Then you should be able to erase it normally. Although if you depend on the package for the pin lock functionality and you don't have USB you might be in trouble. That is the tricky thing with C code; there are many ways to brick your device and you should have access to an swd-programmer while doing development. There are ways to write C code that bricks it even worse than this.

surfdado commented 5 months ago

Success!

I forgot that I already supported PIN-locking in FloatControl without waiting for the package to load (for my son's VESC EBikes). So it worked indeed. After rebootwdt I had to PIN-unlock the controller again and then I was able to erase the old package and load the new one.

But hopefully we can figure out why this was necessary in the first place so that the average user doesn't have to go through this process for each update.

vedderb commented 5 months ago

Success!

I forgot that I already supported PIN-locking in FloatControl without waiting for the package to load (for my son's VESC EBikes). So it worked indeed. After rebootwdt I had to PIN-unlock the controller again and then I was able to erase the old package and load the new one.

But hopefully we can figure out why this was necessary in the first place so that the average user doesn't have to go through this process for each update.

Nice that it worked, and it confirms the theory that the package does not stop properly. There must be something wrong in the c-code that prevents it from shutting down. For example, the threads you create have to check if they are supposed to terminate as they are running and terminate if they are asked to. The firmware cannot force them to terminate, so if they do not check and just keep running it is not possible to stop and thus erase the package.

lukash commented 5 months ago

Well, Refloat has this as the main loop, just as Float:

    while (!VESC_IF->should_terminate()) {
        ...
    }

@surfdado do you have LEDs enabled on this board? That's another thread (though it has the same loop condition).

So, I'm going to stare really hard at it for some time, but I'm quite likely not going to come up with much. Seems the request to terminate somehow doesn't go through? I definitely want to fix this though, I don't feel comfortable releasing it to users.

I wonder if this could have been the cause of this guy's board issues too. It was doing other confusing things though, like it rebooted by itself a second after he clicked to install package. It has momentary button, which needed to be held during FW flashing, otherwise it'll brick. Not sure this is an issue of that controller HW implementation or if that's just how it is with a momentary button. On doing rebootwdt, the board would shut off instead, meaning it was not possible to boot it without the package thread this way (even if the power button was held IIRC). When listing threads on the terminal, it would list a much shorter list than expected and it seemed to be slightly different each boot: IMG_0867 (these are two consecutive boots)

Ultimately, he tried to flash FW via VESC Tool and it failed to erase the FW flash too, I suspect that, at least, was probably caused by the package running?

surfdado commented 5 months ago

I have LED Type set to "External" (it's the Floatwheel), my other board has LEDs enabled too but I'm still on beta3 on it - But the same happened with my desktop VESC with just defaults loaded, no LEDs enabled

surfdado commented 5 months ago

And yes, this was most likely what prevented the erase step for a FW load. The same happened to me too.

vedderb commented 5 months ago

If you are using BLE it is likely that not all threads will be listed as packets get lost when printing them in the terminal. Can you try connecting over USB for debugging this problem?

Can you see if it prints the terminating-message when trying to stop or erase the package? Also, can you check stackmin for the LED thread?

lukash commented 5 months ago

If you are using BLE it is likely that not all threads will be listed as packets get lost when printing them in the terminal. Can you try connecting over USB for debugging this problem?

Honestly it sounds very weird it skips lines due to poor connection (is the text sent line by line?). But I never observed this behavior, it was a user who contacted me after he got issues with uninstalling Refloat. He was also getting only about 1/3 of the threads listed all the time.

I'll test USB on my dev board (which I'm putting back together, I was doing most of development on my daily driver).

Can you see if it prints the terminating-message when trying to stop or erase the package?

Good idea, I'll be watching that.

Also, can you check stackmin for the LED thread?

I'm pretty sure it's fine, it'd be crashing if it wasn't, and surfdado said it's doing for him without the LED thread too (but I'll re-check it).

I've been having these issue to reinstall Refloat as well, but I always felt they're connection issues, even though I got "Erase error" most of the time, I was getting "Failed to start Lisp" (after possibly a long time of writing, indicating connection issues) very often too and occasionally a "Write error" as well. (And, as I said already, I just kept restarting things until it helped)

Also, it always happened for me only after some time of periodical reinstalling, on the first tries after connecting it has mostly worked very well for me.

vedderb commented 5 months ago

Honestly it sounds very weird it skips lines due to poor connection (is the text sent line by line?). But I never observed this behavior, it was a user who contacted me after he got issues with uninstalling Refloat. He was also getting only about 1/3 of the threads listed all the time.

The text is indeed sent line by line. The problem is not necessarily due to a poor connection. If the BLE module runs out of buffer space before it has time to send the buffer packets will be dropped (each line is one print, so one packet). The threads-command is especially tricky as it sends so many packets in one go and the BLE-module as to keep sending over the buffers fast enough as that happens. If there is a hickup in the connection while sending the threads it is very common that a few lines get dropped. How bad that is depends a lot on which BLE connection parameters get negotiated between the phone and BLE module; the connection parameters and delays have a drastic impact on the throughput and are mostly in the control of the phone.

The reason e.g. firmware updates don't lose packets is that every packed is ACKed and if it is lost a number of retries are made. The terminal-commands just blast out all packets without asking if they got received.

vedderb commented 5 months ago

Did you find out why the refloat-app would not stop properly?

lukash commented 5 months ago

@vedderb not yet, haven't really gotten to try to debug this yet. I've put my dev board back together and I rigged it with a USB C to which I also jerry-rigged the SWD pins to the SuperSpeed pins, so that now I can connect USB and debugger at the same time to an assembled board (which will be extremely useful going forward).

I've done some experiments with debugging the package code by loading a second elf in gdb, I think I managed to load it but I'm not sure that e.g. breakpoints were working properly for me.

TL;DR getting set up to be able to properly debug this and the package in general.

With regards to the not stopping problem, I still think the signal is somehow being lost on the way, but not sure yet. I recalled I was at some points seeing the "not terminating, crashing" message in the Lisp log, which would indicate the stop request timed out (IIRC reading the code). That's all the info I've got as of now.

lukash commented 5 months ago

@vedderb I found the thread stopping issue, it was a simple stupid mistake after all. I did also manage to get gdb working with the package, I need to share that (probably in a pev.dev post).

Still haven't gotten around to flip the repository, sorry about the delays. Hopefully I'll do it soon.

vedderb commented 5 months ago

That is good to hear! It is usually the small stupid mistakes that get you and the hardest to find. Nice that you got a gdb-setup running; if you can share those instructions as a pull request to the readme or a separate markdown-file in the vesc_pkg repository that would be great.

No worries about the delay, no need to rush it.

lukash commented 3 months ago

New PR here: https://github.com/vedderb/vesc_pkg/pull/42