machinekit / machinekit-hal

Universal framework for machine control based on Hardware Abstraction Layer principle
https://www.machinekit.io

Continuous Integration / Continuous Delivery woes #268

Open cerna opened 4 years ago

cerna commented 4 years ago

Tracking progress:

So, as an emergency measure, the Travis build system was turned back on for the Machinekit-HAL repository. In response to a discussion with @luminize - where we agreed that some artifact output from a CI run would be nice, so users can download the .deb packages and install them with the dpkg -i command - I implemented a simple GitHub Actions workflow in quick&dirty style, as Travis doesn't allow keeping artifacts for later download and it's nice for users to have automated package builds in their forks. GitHub keeps the build artifacts for 90 days after the fact.
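
For anybody grabbing those artifacts, installation is the usual local-package dance; a minimal sketch (the .deb file name is illustrative):

# Install the downloaded packages, then let APT pull in any dependencies
# the local install could not satisfy.
sudo dpkg -i machinekit-hal_*.deb || sudo apt-get --fix-broken install -y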

Given the volatile situation with the *.machinekit.io server, I think that its state should be conserved and that package uploads should resume to the Packagecloud and Cloudsmith repositories (for redundancy). I can start the upload now or when the package build for Machinekit-CNC is ready (currently the Machinekit-CNC repository has no CI for testing and package building), but I think the right time is after both Machinekit-HAL and Machinekit-CNC can be installed. I also think it is time to drop Jessie support (Jessie will be obsolete in 3 months) and to make sure that Machinekit-HAL runs and produces packages on Bullseye.

If I understand it correctly, the reason behind running our own Jenkins server was the 50-minute run limit of Travis CI. Well, the situation is vastly different today, with everybody and their mother giving Open-Source projects free minutes on their cloud services. So how relevant is, for example, this issue to the current status of the project? Given that the CI/CD rework will have to include solving machinekit/machinekit-hal#195, there is quite a big window for any changes. Including the build Dockerfiles will also hopefully solve the current issues with image availability. (Machinekit-HAL currently uses my own DockerHub account, as there are some missing images in the DovetailAutomata account, which caused machinekit/machinekit-hal#263.)

Today's CI tools also take a container-first approach, for which the current build_with_docker script is poorly suited. That script should be left for home use and the build commands exported to functions that can then be called directly from the container.
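
A minimal sketch of that split, with hypothetical file and function names (not the actual Machinekit-HAL layout):

#!/bin/bash
# scripts/build_functions.sh -- hypothetical shared command library, sourced
# by build_with_docker on a developer machine and callable directly inside
# the CI container.

build_debian_packages() {
    dpkg-buildpackage -uc -us -b    # unsigned binary packages
}

run_regression_tests() {
    make && runtests                # illustrative test entry point
}

# When executed rather than sourced, dispatch to the named function:
[ "${BASH_SOURCE[0]}" = "$0" ] && "$@"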

There is also a provider called Drone Cloud which gives Open-Source projects free build minutes directly on armhf and arm64 hardware. This could be used to test Machinekit-HAL not only on amd64, as is done now, but also on ARM.

Last but not least, there are now cmocka tests in Machinekit-HAL. I am all for unit testing. However, no tests are actually run now, and the build servers are not even installing the cmocka Debian package. So that will also need to be addressed.

zultron commented 4 years ago

Not too important, but for the sake of interest, the history of the CI was as follows. We initially built packages on a buildbot in my shop. The ARM builds were painfully slow because they were running in an ARM chroot on x86 hardware in emulated mode. Later, we moved CI to Travis. ARM builds took longer than the 50 minute limit on the 2-CPU Travis builders, so somebody hacked the build system to split the POSIX, Xenomai and RT-PREEMPT builds into separate builds. That worked to get everything built, except it ended up causing another problem. I think you're right that @ArcEye set up the Jenkins server and recombined the ARM builds in order to finally solve this problem: no time limits on the combined ARM builds. Somewhere in there, I went in and got proper cross-building to work, and later set up the Docker containers we now use; this was an alternate solution, since (combined) ARM build times were reduced to 10 minutes or so on Travis.

The GitHub Actions and Cloudsmith you're looking at weren't around when we were working on this, so maybe there's an easy solution for package distribution in there. Packagecloud's problem was that its draconian data storage and transfer limits made it impractical to use for public package distribution. If I were to try again today and couldn't find another simple, inexpensive solution like Packagecloud's but without the limits, I would use Travis or any other CI system to upload built packages to a VPS running Docker images for receiving new packages and publishing them in a repo. I probably have scripts around somewhere that do this, or I'd likely try to integrate aptly.

Let me know if you need help, but it sounds like you have a more current understanding of the latest tools available today; things have changed quite a lot in the last several years.

cerna commented 4 years ago

Thank you for the historical details about the build system, @zultron. I guess this video is related to this subject, right?

Do you have any idea how much traffic the Machinekit project needs? Packagecloud gives 25 GB of storage and 250 GB traffic (currently the watermark is at about 10 GB of stored packages and the traffic is negligible, as it wasn't really used). Cloudsmith gives 250 GB of storage and 1 TB traffic (they say these limits can be upped based on needs, but hard to say from the get go).

I asked on the Debian IRC channel on Freenode if anybody knows of a FOSS project which uses the Cloudsmith service for .deb package distribution, but they just started arguing among themselves about it being proprietary and off-topic for Debian :roll_eyes: So I don't know how big a project it is capable of serving.

Then there is the Oracle Cloud Always Free Tier with 10-100 GB storage and 10 TB transfer limits. And I was also considering using GitLab Pages for repository hosting. It is possible to maintain this by hand (or this), or probably to use Aptly. They have a 10 GB limit per repository and unlimited traffic (yeah, riiiight...).
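
If the GitLab Pages route were taken, the Aptly side could look roughly like this (repository name, distribution and key ID are illustrative):

# Hedged sketch: build a signed APT repository with aptly and publish the
# resulting static tree, e.g. as the content of a Pages site.
aptly repo create -distribution=buster -component=main machinekit-hal
aptly repo add machinekit-hal ./packages/*.deb
aptly publish repo -gpg-key="0xDEADBEEF" machinekit-hal
rsync -a ~/.aptly/public/ ./public/    # static tree served by the host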

We will probably have to solve this for the helper repository deb.machinekit.io (I don't have the keys).

But in general, I would like for every piece of infrastructure to NOT BE dependent on any specific Machinekit developer or member.

I have created pull request #269 which eats the Dovetail-Automata/mk-cross-builder repository and closes the #195 issue. It would be nice if you could eyeball it. Later, I will take a knife and go gut your other repository for goodies.

To make it work, I had to add the libglib2.0-dev, libgtk2.0-dev, tcl8.6-dev and tk8.6-dev packages to the build dependencies of Machinekit-HAL. I am not completely fine with that, mainly because of the hanky-panky which goes on with these packages in the Dockerfile. The container happily builds without these packages, and then there are dangling symlinks.

cerna commented 4 years ago

Bullseye is a problem. There are missing dependencies:

 machinekit-hal-build-deps : Depends: python-zmq (>= 14.0.1) but it is not installable
                             Depends: yapps2-runtime but it is not installable or
                                      python-yapps but it is not installable
                             Depends: python-pyftpdlib but it is not installable

Given that the same dependencies are listed in the Zultron/mk-cross-builder repository and there are built Docker images on DockerHub, the packages must have been removed from Bullseye in the last 5 months.

I am going to create a pull request with a Bullseye cross builder anyway, as that work has nothing to do with the missing dependencies.

But, bloody Python.

cerna commented 4 years ago

There is some discussion related to this issue on the Machinekit Google Groups forum.

cerna commented 4 years ago

I have just downloaded pyzmq (the source for python-zmq), yapps2 (the source for python-yapps) and pyftpdlib (the source for python-pyftpdlib) through the pip(3) command. Given that there are official Python alternatives, why do these packages have to be installed from an APT repository?

(I don't understand Python and never liked it.)

cerna commented 4 years ago

Update about the Github Actions logic flow for those who are interested:

Now that Machinekit-HAL integrates the 'mk-cross-builder' functionality, the building of new images has to be integrated into the general Machinekit-HAL CI/CD flow. That means the system has to decide when to just download an image from the Docker registry and when to build new images. I think the most sensible solution is to build new images whenever a pull request or push event contains commits changing the files from which the Docker images are built - so scripts/containers/buildsystem/debian, scripts/buildsystem/debian and some other related scripts - and, of course, whenever the repository has no builder images in its own GitHub Packages registry; the idea is to create a solution which is as turn-key and isolated as it gets. In other cases - when there is a usable pre-existing image - just download it from the Docker registry and be done with it.
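
The build-or-pull decision could be sketched like this (the variable names and path regex are assumptions, not the actual workflow code; ::set-output was the GitHub Actions mechanism at the time):

# Decide between rebuilding the builder images and pulling pre-built ones,
# based on which files the push/PR touched.
CHANGED="$(git diff --name-only "${BASE_SHA}" "${HEAD_SHA}")"
if echo "${CHANGED}" | grep -qE '^scripts/(containers/)?buildsystem/debian'; then
    echo "::set-output name=rebuild::true"     # sources of the images changed
else
    echo "::set-output name=rebuild::false"    # reuse the published images
fi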

There lies the first problem - GitHub Packages is a free solution for Open-Source projects which uses the same login credentials as the other GitHub services, so access from GitHub Actions is straightforward. But you cannot delete public packages; once it is there, it is there. And to download a package - even a public one - you have to log in to the registry, and not just with your username and password: you have to create a token with the package download scope. (They must be on some serious quality product...) However, it's all on one network, and for the build system it is the best solution, I think.

So, after deciding whether to build or pull: in the build case, there will be jobs which build the builder images, one per type. The build itself is no problem. It takes on average up to ten minutes, and given that these workers run in parallel, it is 10 minutes total. The problem is how to share these Docker images with the other workers which will run the testing and package building. One way is to upload them to a Docker registry; that works very well. However, these images are not yet proven, and you cannot delete them afterwards. So that is no good. Another way is to upload them as artifacts to GitHub storage and download them from the other jobs. And there lies a problem. GitHub says that for Open-Source there is unlimited storage; in reality, 'unlimited' means 5 GB per file and 50 GB in total (from other users' testing). I was not able to upload Docker saved images bigger than around 0.5 GB, much less the 2 GB tars. (It will upload the first five, then error out on all the rest.) The only way I was able to get them to upload is by compressing them with xz --best - but that takes on average 3x more time than the building process. But even then, it is better to delete the Docker image artifacts afterwards - the only catch is that you cannot do it from the same workflow, so realistically, the first thing you have to do is delete all Docker image artifacts from previous workflows. (Again, insert the drug comment.)
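
The hand-off itself is plain docker save/load; a hedged sketch of both ends (the image and file names are illustrative):

# In the build job: serialize and compress the image for artifact upload.
docker save machinekit-hal-builder:amd64_10 \
    | xz --best --threads=0 > builder-amd64_10.tar.xz

# ...upload/download the .tar.xz with the upload/download-artifact actions...

# In the test/package job: decompress and load it back into Docker.
xz --decompress --stdout builder-amd64_10.tar.xz | docker load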

Fortunately, downloading the artifact and loading the image into Docker in another job is fairly quick (2 minutes tops). A minor issue is that it sometimes fails. But that may be because I am currently using the v2-preview version of the download/upload-artifact actions. (The second edition worked better on the bigger files.) Then the tests will run pretty much the same way as they are running now. If everything passes, another job will push these tested Docker images into the GitHub Packages Docker registry (and, if the user specified a TOKEN, user and repository, somewhere else).

This takes about an hour of parallel running (the maximum is 20 workers). Not great, but doable. Another way to do it is to not build the Docker images beforehand in a job on which the other jobs wait, but to build them in each job separately. That would mean all 14 jobs build the Docker image and run the tests/build the packages. These images would then be thrown away, and another job would rebuild the 11 builder images and push them to the GitHub Packages registry. (It cannot be done in the previous jobs, given that all jobs in a given matrix must finish with success.) The problem with this solution is obvious - testing and building will be done with different images, and the stored images will be different too. Given the time differences, there could be upstream changes with a chance to introduce subtle bugs.

I personally don't know which solution I should use. Maybe I am leaning more towards the one with the crazy wait times on xz --best, given that rebuilding images will not be something done very often. (Or at least I think it won't be.)

lskillen commented 4 years ago

Sounds like @cloudsmith-io would be a lot easier, plus it supports Debian and RedHat natively. ;) It's worth remembering that Docker isn't package management, so building the packages first, then containerising that, is usually a much slicker, more flexible and more efficient solution; plus it lets native users install natively. If you need any help with that, just let us know (I work for Cloudsmith).

zultron commented 4 years ago

Do you have any idea how much traffic the Machinekit project needs? Packagecloud gives 25 GB of storage and 250 GB traffic (currently the watermark is at about 10 GB of stored packages and the traffic is negligible, as it wasn't really used). Cloudsmith gives 250 GB of storage and 1 TB traffic (they say these limits can be upped based on needs, but hard to say from the get go).

That sounds like more than it used to be. Also be careful: if stored files ever exceed 25 GB, which will happen after many CI builds if old ones are not pruned, they will suspend further uploads until the next billing cycle, even if you go in after the fact and clean up space. I emailed them about that, and they said it wasn't a bug, but a feature.

But in general, I would like for every piece of infrastructure to NOT BE dependent on any specific Machinekit developer or member.

That became my goal after realizing my mistake hosting CI in my shop, but I never quite realized it. The $5/month VPS with Docker images to manage the service was the best I ever figured out, but again, there are new services out there today and you seem pretty on top of it.

One other that could've worked was Suse's OBS. IIRC, since the repos migrated to deb.mk.io, they started supporting ARM-arch Debian builds, something that wasn't there before. It might be possible to build on Travis or other CI and upload results to a repo there.

I have created pull request #269 which eats the Dovetail-Automata/mk-cross-builder repository and closes the #195 issue. It would be nice if you could eyeball it.

That looks like a good way to pull the images in. I originally anticipated it becoming a new repo under the MK GH org, but integrating with the mk-hal repo sounds just as good, without having thought about it too hard.

To make it work, I had to add the libglib2.0-dev, libgtk2.0-dev, tcl8.6-dev and tk8.6-dev packages to the build dependencies of Machinekit-HAL. I am not completely fine with that, mainly because of the hanky-panky which goes on with these packages in the Dockerfile. The container happily builds without these packages, and then there are dangling symlinks.

Have you checked that the "hanky panky" is still even necessary? I think there was a problem in older gcc versions where gcc --sysroot=/sysroot still looked in the default /usr/include for headers by default, but later was fixed to look in /sysroot/usr/include. Could be that some of that ugliness isn't necessary anymore.

Ultimately, someday Debian packaging multi-arch support might work properly, and then the need for ALL of that ugliness will go away. I bet that might never happen with TCL/TK, but maybe MK will be TCL/TK-free someday. I also remember cython being one of the blockers. When that's all resolved, cross-building Debian packages will be possible with automatic dependency resolution using standard tools and simple scripts, outside of Docker. Maybe I'll have to keep smoking my pipe several more years first, though.

zultron commented 4 years ago

I have just downloaded pyzmq (the source for python-zmq), yapps2 (the source for python-yapps) and pyftpdlib (the source for python-pyftpdlib) through the pip(3) command. Given that there are official Python alternatives, why do these packages have to be installed from an APT repository?

Because APT installing Machinekit .debs can't pull in those dependencies unless they're also available from an APT repository.

This is a packaging problem, not a Python problem. It used to be the same problem with e.g. libjansson and other non-Python sources that weren't packaged in Debian.

(I don't understand Python and never liked it.)

If this issue is turning into an arena for a battle about favorite programming languages, then my gambit is, "I like Python and I'm happy that many other projects I'm involved with, like ROS and RedHat, make extensive use of it."

cerna commented 4 years ago

(...)One other that could've worked was Suse's OBS(...)

I was thinking about getting the ZeroMQ packages from there. But it seems that not all architectures currently supported by the Machinekit project are available there. I will add it to the to-investigate roster. Frankly, the upload can go pretty much anywhere and everywhere.

That looks like a good way to pull the images in. I originally anticipated it becoming a new repo under the MK GH org, but integrating with the mk-hal repo sounds just as good, without having thought about it too hard.

Actually, it was based on your comment. It just made more sense to me to do it this way, given that you need to somehow upgrade the images when new Debian package dependencies are added or updated, as they are a build input. This way it will be part of the normal Machinekit-HAL testing/package-build flow.

(...)Have you checked that the "hanky panky" is still even necessary?(...)

No. I discovered that commit 05bad16923fe56402c260e1d50ca8ec13585cc5f needs reverting, otherwise the build configure doesn't work, but that's all. I didn't want to follow too many rabbit holes. This is a change which can be done gradually, in small steps in line with the C4, so I am trying to do it that way instead of a big bang. (You would still need the packages I mentioned, as they are checked in configure.)

But when it is time to purge the Jessie parts, that will be the time to look into it more, I think.

(...)cross-building Debian packages will be possible with automatic(...)

There is (was) a developer - @Shaeto - in the legacy Machinekit/Machinekit repository interested in making Fedora .rpm packages. I have no idea if the logic could somehow be generalized above the distro-specific parts, but just to be sure, I put everything into debian/ folders.

Of course, maybe with CMake support and CPack, that will be solved on its own. Or maybe the use of FPM/nFPM or other similar projects should be investigated.

cerna commented 4 years ago

If this issue is turning into an arena for a battle about favorite programming languages, then my gambit is, "I like Python and I'm happy that many other projects I'm involved with, like ROS and RedHat, make extensive use of it."

This issue is about what is specified in the title and the first introductory post. I am a programmer; I solve problems. That being said, solving problems in areas I care about means that I understand the particularities of the given territory, and so I am able to produce a much better solution, i.e. elegant code. So far, I have been able to come up with the following ideas:

  1. Distribute problematic APT packages in a special repository

  2. Install the python modules by pip in postinst script

  3. Use Python dh-virtualenv project

There is python-zmq available from the official ZeroMQ Open Build Service account. But no ARM builds. The yapps2-runtime package is built from the yapps2 source, from which the yapps2 package is also built. Then there is python3-pyftpdlib, the Python 3 version of the Python 2 python-pyftpdlib.

But as you like Python and have much more practical experience with it, I will gladly defer to you. I just want Machinekit-HAL on Bullseye.

In this light, I would postpone the discussion about favourite languages and the associated altercation to a later date. Looking through the calendar, my preference is never. How does that work for you? (We will have to consult the TCL/TK guy yet.)

cerna commented 4 years ago

@lskillen, thank you for piping in. I thought I had seen that avatar somewhere - in the love-open-source example repository.

(...)worth remembering that Docker isn't package management, so building the packages first, then containerising that, is usually a much slicker, more flexible and more efficient solution; plus it lets native users install natively(...)

Actually, the Docker images are only used for the .deb package builds (and testing). They solve the build and runtime dependencies so the CI virtual machine doesn't have to do it on a per-package basis.

The end result is distributed as a .deb (and in future maybe as a .rpm).

If you need any help with that, just let us know (I work for Cloudsmith).

I will take you up on the offer:

  1. (You have a Circle-CI Orb already.) Of course, I can use the Cloudsmith-CLI application in Docker, but that's additional code, and there are probably many projects which could use it.

  2. When I created the Machinekit organization and the Machinekit-HAL repository on @cloudsmith-io, I had to check the "I have enough pull in the project" box. So - as a company - how do you look at distributing packages in this repository which are needed as runtime dependencies but are not an actual part of the project? (Like pyzmq, xenomai-*, libevl etc. - basically the ones which do not have official upstream repositories for all architectures and Debian versions, so we have to hack it.)

zultron commented 4 years ago

1. Distribute problematic APT packages in a special repository

The MK project has been known to do that in the past. The Jessie repo has (had?) kernel, ZMQ-related and other packages unavailable in the upstream distro. The effort to hand-roll a package for a 3rd-party APT repo turns out to be an annoying but often trivial (using existing packaging sources) one-time pain.

2. Install the python modules by `pip` in `postinst` script

Using pip is fine for folks building from source, but using it in a postinst script defies common practice, and inventing new uses for postinst can be fraught with problems (as we've seen even in MK packaging where package scripts were used to manage symlinks).

3. Use Python [dh-virtualenv](https://github.com/spotify/dh-virtualenv) project

Looks like a pretty cool project for someone (else) to dig into.

There is python-zmq available from the official ZeroMQ Open Build Service account. But no ARM builds. The yapps2-runtime package is built from the yapps2 source, from which the yapps2 package is also built. Then there is python3-pyftpdlib, the Python 3 version of the Python 2 python-pyftpdlib.

These are a potential source of packaging to help with option (1).

But as you like Python and have much more practical experience with it, I will gladly defer to you. I just want Machinekit-HAL on Bullseye.

Let's go with option (1) and publish .deb dependencies missing from upstream repos in a 3rd-party repo, repackaging from sources that already exist for other distros (Buster, Sid, etc.). This is the easiest option, especially given that we've done it already and it'll be easy to do it again.

In this light, I would postpone the discussion about favourite languages and the associated altercation to a later date. Looking through the calendar, my preference is never. How does that work for you? (We will have to consult the TCL/TK guy yet.)

Perfect! 100% on the same page, calendar page or otherwise.

On the other hand, I can find time on my calendar soon to help build packages for missing deps, a trivial task. If you had a list of exactly which packages are missing from Bullseye, that would save me half the work.

If the Docker CI images are still useful, @ArcEye submitted a PR at Dovetail-Automata/mk-cross-builder#8 to support Bullseye, which I'm feeling very embarrassed about having dropped right now.

I think the non-trivial part of the APT packaging equation is still going to be building the EMC application. If help is needed with that, I'll volunteer once MK-HAL is packaged up and online. Part of my LCNC-EMC port was to work out many of the fundamental build system issues left from the HAL/CNC repo split.

lskillen commented 4 years ago

@cerna Just noticed your reply now. :-)

Actually, the Docker images are only used for the .deb package builds (and testing). They solve the build and runtime dependencies so the CI virtual machine doesn't have to do it on a per-package basis.

The end result is distributed as a .deb (and in future maybe as a .rpm).

Fantastic; we have similar techniques for our own internal environments at Cloudsmith.

1. (You have a Circle-CI Orb already.) Of course, I can use the Cloudsmith-CLI application in Docker, but that's additional code, and there are probably many projects which could use it.

Yup! Use it if you can. Any suggestions or code enhancements are also welcome.

2. When I created the Machinekit organization and the Machinekit-HAL repository on @cloudsmith-io, I had to check the "I have enough pull in the project" box. So - as a company - how do you look at distributing packages in this repository which are needed as runtime dependencies but are not an actual part of the project? (Like `pyzmq`, `xenomai-*`, `libevl` etc. - basically the ones which do not have official upstream repositories for all architectures and Debian versions, so we have to hack it.)

We're more than happy for you to upload dependencies, assuming that you utilise the repository for distribution of your primary artefacts as well; i.e. it can't be for dependencies only. Other than that, the only real requirement is a (polite) link back to Cloudsmith.

However, you're free to organise your pipeline into multiple repositories; e.g. you can create a repository just for the dependencies, as long as you have another repository for your primary artefacts. That would keep your outputs separate from the artefacts.

In the future, we'll help organise this automatically by labelling dependencies explicitly.

cerna commented 4 years ago

@zultron,

On the other hand, I can find time on my calendar soon to help build packages for missing deps, a trivial task. If you had a list of exactly which packages are missing from Bullseye, that would save me half the work.



I think all of these issues are connected to the Python 2 End-Of-Life in January 2020, because there are python3-* packages for Bullseye in the official Debian repository. But those seem to be unusable in Machinekit's current state.

cerna commented 4 years ago

@lskillen, fantastic, thank you for the answer.

Turns out I am a moron; I read what I had written, and :man_facepalming:. What I actually meant to ask:

Do you plan to introduce an official GitHub Actions action? (You have a Circle-CI Orb already.) Of course, I can use the Cloudsmith-CLI application in Docker, but that's additional code, and there are probably many projects which could use it.

In other words, I left out the most important part of the question. :man_facepalming:

Other than that, the only real requirement is a (polite) link back to Cloudsmith.

Sure, that's given.

However, you're free to organise your pipeline into multiple repositories; e.g. you can create a repository just for the dependencies, as long as you have another repository for your primary artefacts. That would keep your outputs separate from the artefacts.

So if I created a new repository Machinekit-dependencies with all dependencies for all Machinekit projects, and then had a repository Machinekit-HAL where I push build artifacts from this repository, and then a repository Machinekit-CNC where I push build artifacts from Machinekit-CNC, would that be an OK way to do it? Nice!

cerna commented 4 years ago

If the Docker CI images are still useful, @ArcEye submitted a PR at Dovetail-Automata/mk-cross-builder#8 to support Bullseye, which I'm feeling very embarrassed about having dropped right now.

Yes, the Docker CI images are useful. I am not going to fundamentally change something which is currently working, and I don't have deep enough knowledge about it to feel comfortable making deep cuts. I might do some hacking. I actually picked it up in machinekit/machinekit-hal#270, and it is now part of Machinekit-HAL proper. The comment where I described how I build the packages uses this work.

I think the non-trivial part of the APT packaging equation is still going to be building the EMC application. If help is needed with that, I'll volunteer once MK-HAL is packaged up and online. Part of my LCNC-EMC port was to work out many of the fundamental build system issues left from the HAL/CNC repo split.

I haven't looked into it yet. I somehow hoped that there would be a CMake-based build flow first, but that's probably not going to be the case.

But I will get the Machinekit-HAL packaging up and running with publishing to @cloudsmith-io first and foremost. Taking the -CNC part into consideration now would only slow everything down to the nothing-gets-finished level.

lskillen commented 4 years ago

@cerna

Do you plan to introduce an official GitHub Actions action? (You have a Circle-CI Orb already.) Of course, I can use the Cloudsmith-CLI application in Docker, but that's additional code, and there are probably many projects which could use it.

We have one that's been forked from a user of ours (who's OK with us taking ownership): https://github.com/cloudsmith-io/action

It's incredibly spartan at the moment, but we'll almost certainly be tidying this up and publishing it to the GitHub marketplace as well. PRs welcome!

In other words, I left out the most important part of the question. 🤦‍♂

I did wonder!

So if I created a new repository Machinekit-dependencies with all dependencies for all Machinekit projects, and then had a repository Machinekit-HAL where I push build artifacts from this repository, and then a repository Machinekit-CNC where I push build artifacts from Machinekit-CNC, would that be an OK way to do it? Nice!

Yes, yes and yes. :-)

cerna commented 4 years ago

I have looked at the Debian packages produced by the original Jenkins builder and discovered that they were signed by the dpkg-sig tool with the D030445104BADB8A5FC9544FF81BD2B7499BE968 sub-key of the Machinekit Signer key.

mk@mk:~/Downloads/mktemp$ dpkg-sig --list machinekit-hal-posix-dbgsym_0.2.1561737052.git8ad6145-1~stretch_armhf.deb
Processing machinekit-hal-posix-dbgsym_0.2.1561737052.git8ad6145-1~stretch_armhf.deb...
builder
mk@mk:~/Downloads/mktemp$ dpkg-sig --verify machinekit-hal-posix-dbgsym_0.2.1561737052.git8ad6145-1~stretch_armhf.deb
Processing machinekit-hal-posix-dbgsym_0.2.1561737052.git8ad6145-1~stretch_armhf.deb...
GOODSIG _gpgbuilder D030445104BADB8A5FC9544FF81BD2B7499BE968 1561738832

And it got me thinking that this is probably a very good idea - it helps verify that a package was built from the official Machinekit/Machinekit-HAL code-base and not from some fork. (Or the opposite: that a package comes from a specific fork.)

But does anybody (@luminize, @zultron, @cdsteinkuehler) have the original Machinekit Signer primary key, with which a new sub-key could be created for this job?

cerna commented 4 years ago

I have been playing with multistrap for #273. There is a libck-dev with dependencies for Debian Buster now. However, it is not in the main distro repository, but in the buster-backports repository.

I don't think that should be a problem, but it is: I cannot get multistrap to satisfy the dependencies of machinekit-hal-build-deps primarily from the main repository and take only the packages which are not included there - i.e. libck-dev and libck0 - from buster-backports. Multistrap just installs the newest packages available. And that causes a problem, because standard apt installs with priority resolution, first trying the normal repository (which I think is the right course of action).

This fails on protobuf-compiler in the pipeline. (Headers are created with a different version.)

If anybody knows how to solve this, I am all ears.
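
With plain apt (as opposed to multistrap), the standard answer would be pinning; a sketch under the assumption that the backports suite is named buster-backports (multistrap itself may well ignore these preferences):

# Keep backports at low priority; apt then falls back to them only for
# packages - like libck-dev/libck0 - that exist nowhere else.
cat > /etc/apt/preferences.d/backports <<'EOF'
Package: *
Pin: release a=buster-backports
Pin-Priority: 100
EOF
apt-get update && apt-get install -y machinekit-hal-build-deps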

zultron commented 4 years ago

I started taking a look at the Bullseye deps. I suspect it'll take a full Python 3 upgrade to support Bullseye, and that will probably take some real rework. Some issues I already found:

I started a branch, but only got as far as fixing the Docker image build, protobuf python bindings and comp generator before stopping. (This certainly deserves a new issue to further discuss python3 porting.)

cerna commented 4 years ago

(...)and that will probably take some real rework(...)

Well, Python 2 has been EOL for a few months now, and the situation will keep worsening, with packages going missing left and right, so it is probably high time to make the switch. But I just hope (given that the discussion about Python 3 is pretty old: #114 and machinekit/machinekit#563) it will not turn into a rabbit hole and take both the Bullseye and Python 3 projects with it to never-never land. (As I will not be very useful for this endeavour.)

No python3-imaging-tk, python3-glade2, python3-gtkglext1 packages

Looking at python-imaging-tk, it was meant as a transitional package. I guess the transition period ended, huh. To be frank, I am not terribly happy about Machinekit-HAL requiring UI packages; I always thought of it as a headless suite which should not even have X11/Wayland support.

No python3 yapps for Jessie (could be solved by backporting the package)

Jessie will be reaching the end of LTS in two months. I am not sure it is still worth the effort (given that this will take a few weeks) to have it supported for 14 days before I turn the builder off. I am just saying. Jessie had a good run, but it's time to let it go.

I started a branch, but only got as far as fixing the Docker image build, protobuf python bindings and comp generator before stopping.

LinuxCNC's @rene-dev recently started the Python 3 work (at least he said so on IRC #linuxcnc-devel). Maybe it's worth having a looksie at his work and porting the parts which are still the same in both LinuxCNC and Machinekit-HAL?

zultron commented 4 years ago

(...)and that will probably take some real rework(...)

Well, Python 2 has been EOL for a few months now, and the situation will keep worsening, with packages going missing left and right, so it is probably high time to make the switch. But I just hope (given that the discussion about Python 3 is pretty old: #114 and machinekit/machinekit#563) it will not turn into a rabbit hole and take both the Bullseye and Python 3 projects with it to never-never land. (As I will not be very useful for this endeavour.)

Bullseye is going to be a real endeavor to bring up because of the Python 3 issue. As I said offline, I'd like to coordinate with the LCNC folks on this, since any changes to support Python 3 on the HAL side need to be mirrored on the EMC side, and going forward, building LCNC EMC against MK HAL is still the most sustainable plan for making packages available for MK HAL and its top application.

Looking at python-imaging-tk, it was meant as a transitional package. I guess the transition period ended, huh. To be frank, I am not terribly happy about Machinekit-HAL requiring UI packages; I always thought of it as a headless suite which should not even have X11/Wayland support.

I'm definitely in favor of factoring out anything TCL/TK, but I do think tools like halscope belong in HAL and are invaluable. Where do you draw the line, or what would you like to see in an ideal world?

Jessie will be reaching the end of LTS in two months. I am not sure it is still worth the effort (given that this will take a few weeks) to have it supported for 14 days before I turn the builder off. I am just saying. Jessie had a good run, but it's time to let it go.

No disagreement from here. I'd love to jettison some of the ugly stuff we have in the packaging and CI configuration needed to support Jessie. Last I heard, though, @dkhughes has a lot of BBBs out in the field still running Jessie, which is why instead of ditching it back then, we replaced the unmaintained Emdebian tool chain with Linaro.

LinuxCNC's @rene-dev recently started the Python 3 work (at least he said so on IRC #linuxcnc-devel). Maybe it's worth having a looksie at his work and porting the parts which are still the same in both LinuxCNC and Machinekit-HAL?

LinuxCNC/linuxcnc#403 pretty much lays out what's already been done. @gibsonc appears to have done the heavy lifting porting the C-language bindings I started bumping up against a few hours into my naive attempt referenced above (like halmodule.cc). That would be pretty easy to pull over.

Still, I'm most interested in getting packages for released distros online first. If you'd like to point me to where the project is with that and how I can help, I'll spend a few days hammering on it.

cerna commented 4 years ago

I'm definitely in favour of factoring out anything TCL/TK, but I do think tools like halscope belong in HAL and are invaluable. Where do you draw the line, or what would you like to see in an ideal world?

And I am up there with you. However, I think it should live in a separate repository/separate packages and be solved as the original idea dictates. (You will find that many of my ideas are pretty much in line with the original Haberler ones.) A ring buffer from the real-time side to a transshipment point, where the ring buffer frame is sent over a ZeroMQ socket to the display application. That way it can run on the same machine as Machinekit-HAL, or on a notebook next to the device (as service guys are wont to do).

I have been thinking about doing something like that based on WPF/Noesis GUI (first for HAL Meter), but it is on the back burner for me so far. (More important things need to be done.)

No disagreement from here. I'd love to jettison some of the ugly stuff we have in the packaging and CI configuration needed to support Jessie. Last I heard, though, @dkhughes has a lot of BBBs out in the field still running Jessie, which is why instead of ditching it back then, we replaced the unmaintained Emdebian tool chain with Linaro.

That was a year ago, no? I am hoping that since then @dkhughes has twisted his customers' hands and is now mostly running on a distribution which will be supported a little longer. Otherwise, they will not be able to upgrade - unless, of course, he supports it on the side (or they just don't upgrade the Machinekit installation, which is pretty much the same thing, minus security upgrades).

Still, I'm most interested in getting packages for released distros online first. If you'd like to point me to where the project is with that and how I can help, I'll spend a few days hammering on it.

I think it is mostly done. I just need to decide which packages to upload to Cloudsmith. After @luminize merges #274, which implements the Docker image caching and auto-build, I will just redo some of the less important parts - like transforming the bash build commands into another scripting language to allow for hierarchical composition, so the same code will be reusable in a more streamlined way.

What I would like to see is #246, #250 and #200 done - even if Machinekit-CNC lags behind. I know @kinsamanka wanted to wait for it first, but it has dragged on too long. It shouldn't be such a problem with the Machinekit-CNC package build, and it has to be done. Better for it to hurt a little bit from the start than to wait another year for it.

So if you could take a look at it, I would be glad. (Not only would it mean better IDE support for Machinekit-HAL, but it would be great for other purposes too [I am in it mostly for the IDE support :smiley:].)

cerna commented 4 years ago

After merging the #274 and #275 pull requests, Machinekit-HAL now has automatic build and rebuild of missing or changed Debian builder Docker images. These can be downloaded from the Packages section of each repository. For Machinekit/Machinekit-HAL it is this one. Older builds are archived in the Machinekit QUAY registry. I also changed the naming from mk-cross-builder:$tag, where $tag represented the version of the builder, to machinekit-hal-debian-builder-v.$version:$tag, where $version is what $tag used to be, and $tag is now :latest in the case of the GitHub registry and the long git SHA in the case of QUAY.

There are also new labels in the Docker images. They should help with identifying from which fork an image hails. Also, the script checking for the existence of pre-built images in the GitHub Actions workflow should check that all images have the same io.machinekit.machinekit-hal.vcs-ref and that this SHA is in the current git history, i.e. that the image is not based on some other git branch. But that's for later, when the Debian package repository is up.

Of course, there were two errors when this was merged into Machinekit-HAL proper, even though everything was working flawlessly in my repositories. (There would be something seriously wrong with the world if everything worked perfectly.) The first error was a Debian one: W: Failed to fetch http://security.debian.org/debian-security/dists/jessie/updates/main/binary-armhf/Packages Hash Sum mismatch, which is a known issue and not something I can solve. (It's something like Machinekit's threads.0 issue.) The second one was blob upload unknown from the Docker image upload to GitHub Packages - this is also a known issue and also not something I can solve. QUAY is without problems and probably the better service. (Too bad Microsoft didn't buy them.) But it requires a key, so it's not the turn-key solution which I really want this GitHub Actions workflow to be.

(I am going to transform the QUAY registry when I implement the secondary CI/CD flows on CircleCI/CirrusCI/DroneCI away from Github Actions.)

Now I need to get the signing of packages working. (I will generate new keys; it cannot be helped, I don't have the old ones.) And implement the upload job for Cloudsmith. (I am currently testing whether it is better solved by a matrix job or by a single job which downloads all artifacts, filters out only the unique ones [if a workflow fails and you restart it, there will be artifacts from the failed run mixed in] and uploads them to Cloudsmith via the Python CLI.)

I also need to regenerate the images in the eryaf DockerHub account - to which the Travis flow and the older GitHub Actions one are connected - to include libcmocka-dev; otherwise, after adding it to the build dependencies, the building process will fail.

lskillen commented 4 years ago

@cerna If you need it, @cloudsmith-io fully supports Docker images too. :-)

cerna commented 4 years ago

@lskillen, these images are about 800 MB apiece. They are used mostly by the GitHub Actions runners, which are hosted in the East US 2 Azure region, so I don't want to generate unnecessary traffic.

However, given recent Github outages, I will definitely keep it in mind.

Other than that, the only real requirement is a (polite) link back to Cloudsmith.

I have also started uploading Machinekit-HAL packages to Cloudsmith. In the README I added two icons with Cloudsmith's logo pointing to the two repositories, and in the Getting started section I mentioned @cloudsmith-io directly. Is that OK?

BTW, didn't you have some strange issue with webhooks yesterday? I was seeing some random data and was thinking I would have to ask you about it.

cerna commented 4 years ago

OK, signing packages with the dpkg-sig tool is implemented. The new key, which is stored in the Machinekit/Machinekit-HAL secrets storage, is 4A374E9D7CA79FA717293B98D2EFAE426CDDB0FE. It would be great if each fork distributing packages created its own key - that way it would function as a label to better differentiate between sources.
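
The signing step itself is a one-liner per package; a hedged sketch (the .deb name is illustrative, the key ID is the one above):

# Sign as the "builder" role with the repository key, then verify:
dpkg-sig --sign builder -k 4A374E9D7CA79FA717293B98D2EFAE426CDDB0FE machinekit-hal_0.3_amd64.deb
dpkg-sig --verify machinekit-hal_0.3_amd64.deb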

Also, the upload to the Machinekit/Machinekit-HAL Cloudsmith registry is implemented. A little problematic is that it takes about 30 minutes of runtime to upload all the packages, but that is a topic for another time.

What is troubling is the number of outages of the GitHub API (I never noticed it until I started working with GitHub Actions). Two or three days ago, the upload to GitHub Packages (docker.pkg.github.com) failed with an unknown blob error. With the resolution of the latest GitHub outage it worked fine again, but today it is not working once more. I already sent feedback to GitHub and opened a thread on GitHub Community, so I am hoping it will be solved fast. But still, it represents a serious problem and a dent in my turn-key target.


Hmm, and now even network starts to fail:

Err http://ftp.debian.org jessie-updates Release.gpg
  Unable to connect to ftp.debian.org:http:
Ign http://ftp.debian.org jessie Release
Ign http://ftp.debian.org jessie-updates Release
Err http://ftp.debian.org jessie/main amd64 Packages
  Unable to connect to ftp.debian.org:http:
Err http://ftp.debian.org jessie/main i386 Packages
  Unable to connect to ftp.debian.org:http:
Err http://ftp.debian.org jessie-updates/main amd64 Packages
  Unable to connect to ftp.debian.org:http:
Err http://ftp.debian.org jessie-updates/main i386 Packages
  Unable to connect to ftp.debian.org:http:
Fetched 1968 kB in 31s (62.4 kB/s)
W: Failed to fetch http://ftp.debian.org/debian/dists/jessie/Release.gpg  Unable to connect to ftp.debian.org:http:

W: Failed to fetch http://ftp.debian.org/debian/dists/jessie-updates/Release.gpg  Unable to connect to ftp.debian.org:http:

cerna commented 4 years ago

So, after the X-th rerun of the failed workflow, now at 19:47 GMT+0, everything ran correctly. If I had to bet, I would say that GitHub has a problem with load-balancing its API.

(Fortunately, a rebuild of the Docker builder images should not happen on every push, and pulling was working fine.)

lskillen commented 4 years ago

@cerna

These images are about 800 MB apiece. They are used mostly by the GitHub Actions runners, which are hosted in the East US 2 Azure region, so I don't want to generate unnecessary traffic.

However, given recent Github outages, I will definitely keep it in mind.

Makes sense. :-)

Other than that, the only real requirement is a (polite) link back to Cloudsmith.

I have also started uploading Machinekit-HAL packages to Cloudsmith. In the README I added two icons with Cloudsmith's logo pointing to the two repositories, and in the Getting started section I mentioned @cloudsmith-io directly. Is that OK?

Sounds good to me!

BTW, didn't you have some strange issue with webhooks yesterday? I was seeing some random data and was thinking I would have to ask you about it.

Yep. It was sample data. There was a bug with feature flags which meant everyone was shown example data for a short period of time, but it didn't impact the actual functionality under the hood. Nothing to see here, move along! 😁

cerna commented 4 years ago

@zultron, I wanted to talk to you about the part of the original Dockerfile which was creating a Travis user:

# Set up user ID inside container to match your ID
ENV USER=travis
ENV UID=1000
ENV GID=1000
ENV HOME=/home/${USER}
ENV SHELL=/bin/bash
ENV PATH=/usr/lib/ccache:/opt/gcc-linaro-hf/bin:/usr/sbin:/usr/bin:/sbin:/bin
RUN echo "${USER}:x:${UID}:${GID}::${HOME}:${SHELL}" >> /etc/passwd
RUN echo "${USER}:*:17967:0:99999:7:::" >> /etc/shadow
RUN echo "${USER}:x:${GID}:" >> /etc/group

I originally removed it, as I had a feeling that it was burning a specific travis user into the Docker image at build time, and I originally had in mind to create a GitHub Actions Docker action, which should run as root. (That turned out to be a no-go option.) However, the regression tests seem to be happy running with just the --user flag passed to the docker run command.
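
That --user invocation, roughly (the image name and mount paths are illustrative):

# Run the builder image with the host user's UID/GID mapped in, so files
# created in the bind-mounted tree stay owned by the invoking user.
docker run --rm -it \
    --user "$(id -u):$(id -g)" \
    -v "$(pwd):/machinekit-hal" -w /machinekit-hal \
    machinekit-hal-debian-builder /bin/bash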

But to prevent causing problems for users running these Docker images locally, I have been thinking about whether I should somehow connect the runner user with the in-container user. So I have been looking at fixuid and user namespaces.

But do you think it is still an issue? Respectively, do you have a preference for a solution?

zultron commented 4 years ago

For now, I'm dealing with it like this:

https://github.com/zultron/machinekit-hal/commit/46baf131426ae3b26e7197392c0716c47231a2dc

Its shortcoming is the UID is set at image build time, so somebody pulling an image created in CI will encounter annoyance when bind-mounting and building a local MK source tree.

In another project, I have an entrypoint script that does what fixuid does. I'd prefer to do something like that; fixuid if it's easy and lightweight, or else I can pull over that entrypoint script.
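
Such an entrypoint might look roughly like this (a hypothetical sketch, not the actual script from that project; it assumes the container starts as root and gets the desired UID/GID via environment variables):

#!/bin/bash
# entrypoint.sh -- create a user matching the host UID/GID on the fly,
# then drop privileges before running the requested command.
set -e
USER_NAME="${RUN_AS_USER:-machinekit}"
USER_UID="${RUN_AS_UID:-1000}"
USER_GID="${RUN_AS_GID:-1000}"

groupadd --gid "${USER_GID}" "${USER_NAME}" 2>/dev/null || true
useradd --uid "${USER_UID}" --gid "${USER_GID}" \
        --create-home --shell /bin/bash "${USER_NAME}" 2>/dev/null || true

exec setpriv --reuid "${USER_UID}" --regid "${USER_GID}" --init-groups "$@"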

cerna commented 4 years ago

Its shortcoming is the UID is set at image build time, so somebody pulling an image created in CI will encounter annoyance when bind-mounting and building a local MK source tree.

Yeah, I am that somebody. (I am using multiple accounts on my test machines.)

The build_docker bash script also passes the USER variable of the user currently running docker run and mounts HOME (which is something I am not comfortable with). Having echo "$USER" report something different from whoami is misleading, I think. So I have been looking at the fixuid code (and I am pretty lean on Go knowledge) to see if it is able to set the username somehow, but I don't think it is. (I am also foggy on the need for the mounted $HOME and the exported $USER.)

I will try it with fixuid, and if it causes problems, your script can be taken for a spin.

cerna commented 4 years ago

Fixuid does not change the username reported by whoami.

Another viable option is to use [nss_wrapper](https://cwrap.org/nss_wrapper.html), which basically uses LD_PRELOAD to catch all calls to functions querying user data.

Given that nothing beyond line RUN echo "ALL ALL=(ALL:ALL) NOPASSWD: ALL" >> /etc/sudoers is that useful, this could work.
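
In practice that would look roughly like this (environment variable names per the nss_wrapper documentation; the username, paths and the assumption that libnss_wrapper.so is on the loader's search path are illustrative):

# Fake passwd/group entries for the current UID/GID, visible to any
# program started with the nss_wrapper preload.
echo "machinekit:x:$(id -u):$(id -g):Machinekit:/home/machinekit:/bin/bash" > /tmp/passwd
echo "machinekit:x:$(id -g):" > /tmp/group
LD_PRELOAD=libnss_wrapper.so \
NSS_WRAPPER_PASSWD=/tmp/passwd \
NSS_WRAPPER_GROUP=/tmp/group \
whoami    # now reports "machinekit" regardless of /etc/passwd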

cerna commented 4 years ago

Oh, nice, I discovered another bug in GitHub Actions. This one is quite severe, I think.

I described it in detail on Github Community.

So here, just in short: I stored a secret in Machinekit/Machinekit-HAL with the name of the QUAY registry. Everybody can tell what it is - obviously machinekit. But given that every fork can have its own 3rd-party registry (my own has one too), it cannot be part of the repository, otherwise it would clash with the fork+work+pull flow. However, if a secret is contained in a job output, the step does not fail; it actually outputs an empty string.

So, if I hadn't been watching the workflow run and hadn't stopped it, it would have built and pushed a new Docker image like docker.pkg.github.com/machinekit/Machinekit/i386_10:latest, thus creating new packages in GitHub Packages which then could not be removed. Pretty bad from my viewpoint.

Because of this, I deleted the secrets pertaining to the QUAY registry upload. (No more Debian builder Docker image history there.)

zultron commented 4 years ago

Fixuid does not change the username reported by whoami.

Another viable option is to use [nss_wrapper](https://cwrap.org/nss_wrapper.html), which basically uses LD_PRELOAD to catch all calls to functions querying user data.

Given that nothing beyond line RUN echo "ALL ALL=(ALL:ALL) NOPASSWD: ALL" >> /etc/sudoers is that useful, this could work.

OK. Or I can port over my entrypoint script, which would be pretty easy. Whatever you prefer.

cerna commented 4 years ago

OK. Or I can port over my entrypoint script, which would be pretty easy. Whatever you prefer.

I have put Fixuid in my latest pull request #278 - if it causes you any problems, or you don't like it or whatever, just change it.

zultron commented 4 years ago

As @cerna found in one of my recent PRs, I managed to break the packages again so that -hal and -hal-dev can't be co-installed.

One of the changes I made enables running tests against installed packages. Once the LCNC compatibility PR is merged, we should update CI to exercise an LCNC build. One part of that could be to install and build against MK packages, which will guard against this problem.

Otherwise, just add a step to CI to explicitly install MK packages, and optionally run tests against those, too.

cerna commented 4 years ago

Looking through the Machinekit organization, I discovered that there are two repositories pertaining to the buildsystem: machinekit-NG and mk-builder. Given that one of my goals is to make Machinekit as clear and obvious (i.e. not confusing) as humanly possible, I would like to know if these systems are still useful (in other words, if somebody is still using them for something) - and if not, then I vote to either delete them, or update the READMEs and archive them. (The same way I archived the machinekit/Machinekit repository.)

Otherwise, it will start/continue to confuse newcomers with the whole mk-builder/machinekit-NG/mk-cross-builder/machinekit-hal menagerie of (Docker) buildsystems.

cerna commented 4 years ago

Just to be complete: in my surfing I discovered that, besides Drone Cloud (armhf and arm64, based on Docker), other free CI/CD services offer native building on ARM, namely Travis CI (arm64 only, based on LXC), Shippable (armhf and arm64) and Codefresh (an unknown version of ARM).


Also, I wanted to pose the question of supporting some git hook manager - for example Lefthook (but the actual tool is not that important). The problem it should solve is this: Machinekit-HAL now has a pre-commit hook for formatting the C/C++ files using clang-format - the problem is, it doesn't really work, because clang-format is very picky about the version of its configuration file. I have version 11 installed, and that is incompatible with the config file in Machinekit-HAL. So I wasn't really using it. One way to solve this issue is - wait for it - to use a specific version of clang-format in a Docker container. The same goes for the other formatters - for Python (not sure what), for bash (shfmt) and for whatever else. Another use-case for this is when one wants to use an IDL (like JSONNET) but still has to commit the generated output to the repository (like the YAML) - this could be autogenerated and committed in one commit. The CI on the server side would then check that both files were changed in one commit and that the output was actually generated from the input.
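
A Docker-pinned hook could be as small as this sketch (the image tag is hypothetical; this is not the actual Machinekit-HAL hook):

#!/bin/sh
# pre-commit -- format staged C/C++ files with a pinned clang-format.
FILES="$(git diff --cached --name-only --diff-filter=ACM | grep -E '\.(c|h|cc|hh|cpp|hpp)$')"
[ -z "$FILES" ] && exit 0
docker run --rm --user "$(id -u):$(id -g)" -v "$(pwd):/src" -w /src \
    machinekit/clang-format:8 clang-format -i $FILES   # hypothetical image
git add $FILES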

The problem is: it will require every developer to install both the git hook manager and Docker (or Podman, or another containerization technology which can use the OCI format or Dockerfiles), and to actually use them. (I think it is a good time to implement something like this while I am currently wearing the devOps hat.)

So, any opinions?

zultron commented 4 years ago

Looking through the Machinekit organization, I discovered that there are two repositories pertaining to the buildsystem: machinekit-NG and mk-builder. Given that one of my goals is to make Machinekit as clear and obvious (i.e. not confusing) as humanly possible, I would like to know if these systems are still useful (in other words, if somebody is still using them for something) - and if not, then I vote to either delete them, or update the READMEs and archive them. (The same way I archived the machinekit/Machinekit repository.)

mk-builder was an early incarnation of mk-cross-builder, which is now pulled into machinekit-hal, of course. It can be safely deleted.

I don't know what machinekit-NG is all about.

zultron commented 4 years ago

Also, I wanted to pose the question of supporting some git hook manager - for example Lefthook (but the actual tool is not that important). The problem it should solve is this: Machinekit-HAL now has a pre-commit hook for formatting the C/C++ files using clang-format - the problem is, it doesn't really work, because clang-format is very picky about the version of its configuration file. I have version 11 installed, and that is incompatible with the config file in Machinekit-HAL. So I wasn't really using it. One way to solve this issue is - wait for it - to use a specific version of clang-format in a Docker container. The same goes for the other formatters - for Python (not sure what), for bash (shfmt) and for whatever else. Another use-case for this is when one wants to use an IDL (like JSONNET) but still has to commit the generated output to the repository (like the YAML) - this could be autogenerated and committed in one commit. The CI on the server side would then check that both files were changed in one commit and that the output was actually generated from the input.

The problem is: it will require every developer to install both the git hook manager and Docker (or Podman, or another containerization technology which can use the OCI format or Dockerfiles), and to actually use them. (I think it is a good time to implement something like this while I am currently wearing the devOps hat.)

@machinekoder set up something like this (on steroids!), pre-commit, for another project @luminize and I are involved in. I absolutely LOVE having formatters automatically fix indentation, trailing white space and wrapping lines to within 80 characters. It handles C, C++, Python (incl. flake8), Bash, CMake, XML and more.

On the other hand, forcing all developers to install the git hooks is quite onerous. In that project, the hooks run in a Docker container to solve the matching versions issue and also because some of the tools are pretty hard to install. So that means developers need Docker, yet another hurdle.

I wonder if there's a way to get CI to run the formatters on a PR and add a commit with the results? That would make things a whole lot easier, but I don't know if or how it would work. A human user could run the formatters and submit a second PR against the first PR's branch. Or one could periodically run formatters on the latest master and submit a PR with any changes. I wonder if that's possible to automate without a huge learning curve?
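
One possible shape for that, sketched as the shell steps a CI job (Github Actions or similar) could run after checking out the PR branch with push rights; the bot identity and the branch variable are made up for the example:

```bash
#!/bin/bash
# Run the formatters over the tree and, if they changed anything,
# push a fixup commit back to the pull request branch.
set -e
pre-commit run --all-files || true  # hooks exit non-zero when they modify files
if ! git diff --quiet; then
    git config user.name  "formatting-bot"
    git config user.email "bot@example.invalid"
    git commit -am "Apply automatic formatting"
    git push origin "HEAD:$PR_BRANCH"  # PR_BRANCH: supplied by the CI service
fi
```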

cerna commented 4 years ago

mk-builder was an early incarnation of mk-cross-builder, which is now pulled into machinekit-hal, of course. It can be safely deleted. I don't know what machinekit-NG is all about.

So, delete the mk-builder repository and set the machinekit-NG repository as archived until somebody says otherwise? On the other hand, it looks like an old version of the machinekit/machinekit repository (with the last commit made by Michael Haberler [which in itself dates it] over four years ago), so I am not sure if it has any merit.

@luminize, do you have any leaning on this?


@machinekoder set up something like this (on steroids!), pre-commit, for another project @luminize and I are involved in. I absolutely LOVE having formatters automatically fix indentation, trailing white space and wrapping lines to within 80 characters. It handles C, C++, Python (incl. flake8), Bash, CMake, XML and more.

You mean pre-commit, the project or pre-commit, the actual git hook?

On the other hand, forcing all developers to install the git hooks is quite onerous. In that project, the hooks run in a Docker container to solve the matching versions issue and also because some of the tools are pretty hard to install. So that means developers need Docker, yet another hurdle.

That's my worry about this too. Running the action in Docker has its own magic: it doesn't pollute the developer's system installation with stuff he doesn't otherwise use, you can use it in a CI system (well, not Drone, but Drone has its issues), and you can share the functionality scripts across repositories by storing the actual Dockerfiles/entrypoint scripts in another one. But the developer has to have a Docker service installed and running.

But thinking about it, I realized that Machinekit as a project is going in the direction of Docker-centered development (just not Docker deployment, which is what containers are typically used for). So maybe it's not such an out-there idea to actually require container technology.

Then again, I would really only care about a few critical files/changes. For example, if the CI service is controlled by a YAML file which is generated from some IDL file, then I would check that both were changed in one commit, that they are functionally equivalent and so on. But whether the developer used some git hook manager hooked up to a Docker container run, or wrote both files by hand from memory - that I don't care about.
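
As a sketch of what the server-side check could look like (the generator script and the file names are hypothetical):

```bash
#!/bin/bash
# Fail the CI run when the committed YAML has drifted from its IDL source.
set -e
./scripts/generate-ci-config ci/pipeline.jsonnet > /tmp/pipeline.generated.yml
if ! diff -u ci/pipeline.yml /tmp/pipeline.generated.yml; then
    echo "ci/pipeline.yml is out of sync with ci/pipeline.jsonnet - regenerate and recommit" >&2
    exit 1
fi
```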

And about formatting, I would like to gradually get some ugliness out of the source tree. I frequently use the hover-over-a-code-line, get-the-last-commit-message functionality. That would disappear with a repository-wide reformatting. When doing a gradual change, I will at least get the commit message of the last change to the file (not the line, the file), which is slightly better.

I wonder if there's a way to get CI to run the formatters on a PR and add a commit with the results? That would make things a whole lot easier, but I don't know if or how it would work. A human user could run the formatters and submit a second PR against the first PR's branch. Or one could periodically run formatters on the latest master and submit a PR with any changes. I wonder if that's possible to automate without a huge learning curve?

You could, sure. But you would need to go through each commit and amend it, which would cause a history rewrite for the user. (No fast-forward for him.) I investigated it a year or so back and the consensus on the internet was not to do it. And periodic reformatting has the same problem as a repo-wide action - it's actually worse, because it hurts permanently rather than just once.

About automation, the theory is simple. The problems are the gotchas you hit and the general errors born of forgetfulness when implementing it.


Basically, the way I see it: the git hooks are a local CI pipeline.

zultron commented 4 years ago

@machinekoder set up something like this (on steroids!), pre-commit, for another project @luminize and I are involved in. I absolutely LOVE having formatters automatically fix indentation, trailing white space and wrapping lines to within 80 characters. It handles C, C++, Python (incl. flake8), Bash, CMake, XML and more.

You mean pre-commit, the project or pre-commit, the actual git hook?

The project, sorry.

But thinking about it, I realized that Machinekit as a project is going in the direction of Docker-centered development (just not Docker deployment, which is what containers are typically used for). So maybe it's not such an out-there idea to actually require container technology.

While Docker-centered development has significantly changed my life for the better, I have mixed feelings about making it a requirement for a project that hopes to attract more developers. Docker's huge advantage is that it containerizes away the hell of installing all the dependencies on a developer machine, regardless of the host OS. I have also found that running Machinekit out of a container makes life very easy; for example, running ROS in an Ubuntu container on a Debian host with an off-the-shelf RT_PREEMPT kernel; this maximizes the ability to use pre-packaged software (and was actually my #1 reason for complaining about MK-HAL flavor packages not being co-installable: the same container image can run on any host OS, no matter what RT threads environment is available). On the other hand, it does require a learning curve to understand things like why changes disappear after container restart and how to build them into a new image; how to adapt the development workflow, editing files outside the container while running build and test tools inside; etc. etc. A lot of people immediately reject moving to Docker without even understanding what it is; I guess they (rightly) anticipate a learning curve and don't welcome the prospect.

Then again, I would really only care about a few critical files/changes. For example, if the CI service is controlled by a YAML file which is generated from some IDL file, then I would check that both were changed in one commit, that they are functionally equivalent and so on. But whether the developer used some git hook manager hooked up to a Docker container run, or wrote both files by hand from memory - that I don't care about.

I don't understand this; do we generate YAML from IDL files today?

And about formatting, I would like to gradually get some ugliness out of the source tree. I frequently use the hover-over-a-code-line, get-the-last-commit-message functionality. That would disappear with a repository-wide reformatting. When doing a gradual change, I will at least get the commit message of the last change to the file (not the line, the file), which is slightly better.

I'd like to preserve the git blame ability without having to dig as well. But formatters operating at the file level still destroy that after the first change of even a single line. I think the choice is between using formatters at all on the one hand and preserving top-level blameability on the other.

I wonder if there's a way to get CI to run the formatters on a PR and add a commit with the results? That would make things a whole lot easier, but I don't know if or how it would work. A human user could run the formatters and submit a second PR against the first PR's branch. Or one could periodically run formatters on the latest master and submit a PR with any changes. I wonder if that's possible to automate without a huge learning curve?

You could, sure. But you would need to go through each commit and amend it, which would cause a history rewrite for the user. (No fast-forward for him.)

No way! Just add an additional commit on top fixing the problematic formatting.

I investigated it a year or so back and the consensus on the internet was not to do it. And periodic reformatting has the same problem as a repo-wide action - it's actually worse, because it hurts permanently rather than just once.

Hurts what? Blameability? Or something else?

About automation, the theory is simple. The problems are the gotchas you hit and the general errors born of forgetfulness when implementing it.

I agree. It took a long time to work out all the kinks in the other project I mentioned above.

In any case, I'm pretty much willing to go either direction. Fixing the formatting and requiring Docker for all but the most trivial contributions would be two welcome changes for me personally, but at the same time I've seen how for others those can cause problems that I can't solve, and not for lack of trying.

luminize commented 4 years ago

So, delete the mk-builder repository and set the machinekit-NG repository as archived until somebody says otherwise? On the other hand, it looks like an old version of the machinekit/machinekit repository (with the last commit made by Michael Haberler [which in itself dates it] over four years ago), so I am not sure if it has any merit.

@luminize, do you have any leaning on this?

@cerna I think this was a historical effort/start of some kind of CMake project for building. It says in the repo description "Machinekit NG - development repo for new build system. DO NOT USE except for development." But since it's stale and nobody has been working on it for four years, I think it's wise to delete it.

cerna commented 4 years ago

While Docker-centered development has significantly changed my life for the better, I have mixed feelings about making it a requirement for a project that hopes to attract more developers.

Definitely no requirement, only a recommendation, with the note that it makes life (and contributing) easier. Personally, I think that with a simple step-by-step tutorial on how to compile Machinekit-HAL's packages using (scripts that use) Docker, people would not have much of a problem with it.
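
Something along these lines, as a sketch only - the image name and the helper script are placeholders, not the current build_with_docker interface:

```sh
# Clone, then build the packages inside a pinned builder container.
git clone https://github.com/machinekit/machinekit-hal.git
docker run --rm -v "$(pwd):/work" -w /work/machinekit-hal \
    example/mk-builder:buster-amd64 \
    ./scripts/build_packages       # hypothetical helper wrapping dpkg-buildpackage
sudo dpkg -i machinekit-hal*.deb   # the .debs land next to the source tree
```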

On the other hand, it does require a learning curve to understand things like why changes disappear after container restart and how to build them into a new image; how to adapt the development workflow, editing files outside the container while running build and test tools inside; etc. etc.

Sure, but by today's standards it is quickly becoming required base knowledge. So this is going to be less of a problem, I think. But still, this is a reason why I would not tie usage of this system to the mergeability of commits. Just a tool for those interested.

I guess they (rightly) anticipate a learning curve and don't welcome the prospect.

That is logical. But the industry will sweep them along kicking and screaming.

I don't understand this; do we generate YAML from IDL files today?

No, not today. However, it looks like Drone will not allow me to create dynamic runner matrices on the fly, and other CI services don't support it either. (Turns out it is rather less common than I thought.) And I would really like one-point-of-change settings. (I was annoyed when I was tracking down why, after changing the version number in the VERSION file in the root of this repository, the packages were still produced with the old version number.) Given that I will not get that, I want the next best thing - automatically erroring out if a file should have been changed but wasn't. And so that I don't have to change the same information in X different files, a pre-commit git hook would do it for me.
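
For the "error out if a file should have been changed but wasn't" part, the hook could be as dumb as this sketch (the file paths are examples):

```sh
#!/bin/sh
# Pre-commit guard: a commit that touches VERSION must also touch the
# packaging metadata that repeats the version number.
staged=$(git diff --cached --name-only)
if echo "$staged" | grep -qx 'VERSION' && ! echo "$staged" | grep -q '^debian/'; then
    echo "VERSION changed but nothing under debian/ was updated" >&2
    exit 1
fi
```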

But at this point it's just talk. I am not saying that it has to be that way or that I need it.

I'd like to preserve the git blame ability without having to dig as well. But formatters operating at the file level still destroy that after the first change of even a single line. I think the choice is between using formatters at all on the one hand and preserving top-level blameability on the other.

True. There is a level of wizardry where you can format just the changed lines in a file and leave the rest be. I don't know off the top of my head how to implement it with common tools - it would need tailored scripts for sure. But it is possible.
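
That said, for C/C++ at least, clang-format itself can restrict formatting to explicit line ranges (--lines=start:end), and the git-clang-format wrapper shipped with it uses the diff to reformat only the touched lines. A pre-commit sketch (exact options differ between clang-format versions):

```sh
#!/bin/sh
# Reformat only the lines changed by the current diff, then restage them.
git clang-format --style=file
git add -u
```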

I think that for me, the formatted code has higher priority than git blame.

No way! Just add an additional commit on top fixing the problematic formatting.

There are even premade bots - that's how it works. Or just create that commit from Github Actions. But I don't know, it still doesn't quite feel right to just add code to a contributor's pull request (and it would need to be added, because you have to run tests on it, and to avoid uncertainty you cannot add it once to the branch against which you test and then a second time when the merge push is done).

There is also Pronto - I haven't tried it yet, but it should create a bunch of code review comments, which should nudge somebody into actually using the formatter.

Hurts what? Blameability? Or something else?

Yes, blameability. On the other hand, if somebody ran the formatter on the whole repository and created a pull request from it, the tests would (probably) be green and by C4 it would be merged. So maybe this is a moot point. (It would just be a human doing it.)

In any case, I'm pretty much willing to go either direction. Fixing the formatting and requiring Docker for all but the most trivial contributions would be two welcome changes for me personally, but at the same time I've seen how for others those can cause problems that I can't solve, and not for lack of trying.

I would fix the formatting and not make Docker required - but make its use simple for those who want it.

cerna commented 4 years ago

@cerna I think this was a historical effort/start of some kind of CMake project for building. It says in the repo description "Machinekit NG - development repo for new build system. DO NOT USE except for development." But since it's stale and nobody has been working on it for four years, I think it's wise to delete it.

Thanks. I created issues and pushed commits to both repositories informing all interested parties that I intend to delete them both.

cerna commented 4 years ago

Trying to integrate the native Docker image builder (which I could then use for the Drone Cloud CI service) into the current Dockerfile, I discovered that it would be better to wait until Jessie LTS is dead (less than a month from now).

So I started rewriting the Docker builders/testers without Jessie support, and given that the original problem will no longer apply, these can use the multiarch and cross-toolchain cross-building approach. Preliminary testing shows that this way I can build armhf, aarch64 and i686 on Buster (amd64 is the native build) and armhf and aarch64 on Stretch (with amd64 again being the native build) - the only problem is the i686 architecture, as there are no gcc-6-i686-linux-gnu or g++-6-i686-linux-gnu packages in Debian Stretch. (There are for Ubuntu Bionic, which is a similar system, but not quite the same.)
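
For reference, the multiarch cross-build boils down to roughly this (a sketch for armhf on Buster; the toolchain and apt steps are the stock Debian ones, the rest is illustrative):

```sh
# Inside a Buster container, from the machinekit-hal source tree:
dpkg --add-architecture armhf                     # enable the foreign architecture
apt-get update
apt-get install -y crossbuild-essential-armhf     # cross gcc/g++, binutils, libc
apt-get build-dep -y --host-architecture armhf ./ # armhf Build-Depends
dpkg-buildpackage --host-arch armhf -us -uc       # unsigned cross-built packages
```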

One way to solve this is to use multilib for the Stretch/i686 build, similarly to how it is used now, or to just build the required toolchain for Stretch i686. (I don't think many people, if any, will use this, so maybe it is all just an exercise in futility.)

But all of this will allow us to ditch the sysroot chroot and the dpkg-shlibdeps patching used for builds now. Not that I have anything against the current system (and it surely wasn't easy to implement), but it is not suitable for the matryoshka approach of building applications on top of Machinekit-HAL (like the EMCApplication).

I haven't yet tested the built Debian packages on real hardware (and I would be glad for volunteers), other than checking that the ELF headers are right for the given architecture.
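
(For the record, the ELF check is nothing more than something like this; the package version and binary path are just examples:)

```sh
# Unpack a built package and check which machine the binaries target.
dpkg-deb -x machinekit-hal_0.3_armhf.deb /tmp/mk
readelf -h /tmp/mk/usr/bin/halcmd | grep Machine   # expect "Machine: ARM"
```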

All in all, I think it is the right way forward. And hopefully on Bullseye the cross-builder will work just like the native one without any changes. (Not all build dependency packages are multiarch co-installable yet in Buster, and even fewer are in Stretch.)

cerna commented 4 years ago

Machinekit NG and mk-builder repositories deleted today.

zultron commented 4 years ago

[...] Preliminary testing shows that this way I can build armhf, aarch64 and i686 on Buster (amd64 is the native build) and armhf and aarch64 on Stretch (with amd64 again being the native build) - the only problem is the i686 architecture, as there are no gcc-6-i686-linux-gnu or g++-6-i686-linux-gnu packages in Debian Stretch. (There are for Ubuntu Bionic, which is a similar system, but not quite the same.)

One way to solve this is to use multilib for the Stretch/i686 build, similarly to how it is used now, or to just build the required toolchain for Stretch i686. (I don't think many people, if any, will use this, so maybe it is all just an exercise in futility.)

[...] All in all, I think it is the right way forward. And hopefully on Bullseye the cross-builder will work just like the native one without any changes. (Not all build dependency packages are multiarch co-installable yet in Buster, and even fewer are in Stretch.)

I can't quite tell what you've done. I think I remember there still being Multi-Arch: problems in MK's (and LCNC's) package dependencies in Stretch, where it was impossible to co-install all the required host-arch dependency packages without apt uninstalling required build-arch dependencies. Have you found this isn't true (at least for most architectures)?

For the i686 problem, can you try using the amd64-arch gcc-6 package for the i686 build, and just pass in CFLAGS=-m32 and LDFLAGS=elf_i386 as we do in the current system?
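
A quick sketch of what I mean, using the multilib packages that do exist in Stretch (the toy compile is only to show the mechanics):

```sh
# Use the native amd64 gcc-6 with multilib to emit 32-bit i386 code.
apt-get install -y gcc-6-multilib g++-6-multilib
gcc-6 -m32 -o hello hello.c   # -m32 selects the i386 target and elf_i386 linking
file hello                    # -> ELF 32-bit LSB executable, Intel 80386
```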

If you can make this work, it would be absolutely fantastic. The /sysroot hacks were never meant to be permanent. They solved the problem of their time, but they're absolutely hideously hairy and ugly, and it will be a relief if we can get rid of them FOREVER.