machinekit / machinekit-hal

Universal framework for machine control based on Hardware Abstraction Layer principle
https://www.machinekit.io

Plan to allow running LinuxCNC atop Machinekit-HAL #267

Open cerna opened 4 years ago

cerna commented 4 years ago

I would like to dedicate this issue to tracking progress, ideas and problems with porting (and running) LinuxCNC@master atop the Machinekit-HAL@master. This discussion was already partially broached in machinekit/machinekit-hal#260, where the main problem was pointed out: There needs to be some consensus on how to integrate LinuxCNC onto Machinekit-HAL so the limitations and consequences for Machinekit-HAL development would be minimal or at least clearly defined.

Given that I heard about it in Machinekit's back channels, it's not something generally discussed here in the open. This is detrimental to any potential lurker here on Machinekit GitHub or in the Machinekit Forum who is preparing to do some (serious) development. This way they can chime in here.

Work on this endeavour was already done by @zultron and @Arceye. The former can be previewed in https://github.com/zultron/machinekit-hal/tree/mk-hal-lcnc-and-singlemoddir-zultron-pr, the latter is (probably) passé now (author left the project). Other than these two examples, I am not aware of anybody else working on this. (If you are, please, declare this status here.)

cerna commented 4 years ago

I will bite and describe my musings on how I would (try to) go about implementing this. I think that it's imperative to keep the distance between Machinekit-HAL and LinuxCNC. Looking at the current state of the LinuxCNC community, I see no move towards a more modularized approach. Quite the opposite, given that new functionality is being added on the heap. So the chances of LinuxCNC going the same route and separating -CNC into its own repository/package are slim to none. So any separation would have to come from the Machinekit side.

My theory on how to approach this is something along the lines of exporting all real-time scheduled components/drivers into Machinekit-HAL HAL and rtapi_app execution threads and let LinuxCNC run in normally scheduled (SCHED_OTHER) container with all so-called userspace components (which are needed for LinuxCNC run), with the NML communication between modules in Machinekit-HAL and containerized LinuxCNC open/shared. And with ringbuffer/triple-buffer synchronizing the state of Machinekit-HAL HAL and LinuxCNC HAL.

This would also mean running emcsrv on the Machinekit-HAL side with channels for the RT components and (probably) on the LinuxCNC side with channels for everything else (and maybe some form of synchronization). I think it could be a little similar to the gross hack described here.

Then I imagine that Machinekit will have a repository with a branch which will track (from time to time) the LinuxCNC@master. I think that it could mean the least-possible-amount-of-work for keeping the LinuxCNC up to date with Machinekit-HAL.

However, question to those who already tried to run LinuxCNC over Machinekit-HAL: Do I completely miss the mark?

zultron commented 4 years ago

As you noted in the description, I did already get this working.

No major redesign of the existing LCNC-EMC to HAL interface was required. Mainly, there are the motion and io modules. Building these against MK HAL is similar to other out-of-tree module builds. The challenge was teaching the LCNC build system to do this: changes in configure.ac and Makefile/Submakefile systems to configure an external HAL rather than internal, and in that case, turning off the build for the local HAL and building motion and io against the external HAL, with properly located C headers and libraries, plus other details. There were a few minor API changes in MK-HAL that broke compatibility, but I determined that these changes could easily be reverted and the extra functionality the changes provided could be implemented without API breakage.

Otherwise, LCNC-EMC on MK-HAL configurations will use the standard HAL modules that come with MK.

NML is the linkage between motion and io HAL modules and the EMC application milltask process. That should all be left on the LCNC side of the split.

Using the line of split that I chose, the updates on the LCNC side were IIRC on the order of a few hundred lines primarily in the build system, and going forward would be very easy to merge with updates to LCNC. IMO the changes are also trivial enough that the LCNC community might maybe possibly be convinced to merge them upstream, and the MK project wouldn't be required to maintain a forked EMC repo at all.

zultron commented 4 years ago

I've been reviving this project over the last few days, and have the two halves building again. Now I'm working through the regression tests failing on the LCNC-side.

One stumbling block is that mentioned in https://github.com/machinekit/machinekit-hal/issues/104#issuecomment-511530519: Some regression tests run halcompile --install foo.comp, but when the HAL module directory is /usr/lib/linuxcnc/modules, that fails with IOError: [Errno 13] Permission denied. This is a minor problem and there are hacks to work around it, but a proper fix isn't so trivial.

cdsteinkuehler commented 4 years ago

I like the idea of a hal component search path, or at the very least, a system and a user path.

zultron commented 4 years ago

@cdsteinkuehler +1

Here's what I'm thinking seems easiest to implement (before actually having tried anything):

This would solve the use cases I'm aware of, all of which require the standard comps from the default path, plus one or more comps built out of tree. If someone wants to come in later and change the single user path to a searchable set of paths, they could do that without breaking compatibility.
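To make the system-plus-user-path idea concrete, here is a minimal shell sketch of how a module lookup could work. It is not taken from any branch: the function name `find_hal_module`, the two environment variables, the default user directory, and the choice to let the user directory win over the system one are all assumptions for illustration. Only `/usr/lib/linuxcnc/modules` comes from the error report above.

```shell
#!/bin/sh
# Hypothetical sketch: resolve a HAL module by checking a single user
# directory first, then the system directory. All names are assumptions.

HAL_SYSTEM_MODDIR="${HAL_SYSTEM_MODDIR:-/usr/lib/linuxcnc/modules}"
HAL_USER_MODDIR="${HAL_USER_MODDIR:-$HOME/.local/lib/hal/modules}"

find_hal_module() {
    # $1: module name without the .so suffix
    for dir in "$HAL_USER_MODDIR" "$HAL_SYSTEM_MODDIR"; do
        if [ -f "$dir/$1.so" ]; then
            # Print the first match and stop searching.
            printf '%s\n' "$dir/$1.so"
            return 0
        fi
    done
    # Not found in either directory.
    return 1
}
```

With a layout like this, an unprivileged install step would only ever need to write into the user directory, and replacing the single user directory with a list later would only change the loop.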

Comments?

zultron commented 4 years ago

This is really close now, except I broke the final CI builds while splitting the work up into four separate PRs. I also broke the ability to install mk-hal and mk-hal-dev packages somewhere in the last three PRs, but I've already fixed it in the forthcoming final PR, probably ready tomorrow.

Lots of good news:

I think it's going to take another day to get a new PR ready to go, but I'm seriously excited about this. Thanks to @cerna for the shiny new CI system, and for holding my hand through it. I'm going to need a lot more before this is done.

cerna commented 4 years ago

Looks like the module hm2_pci.so was lost in translation. Reported here.

BTW, there is nice GUI here where one can look through package files.

~~I need sleep. I pasted image from armhf... But this one from amd64_9 doesn't have it either.~~


And it's on me. Double great.


Should be solved by #283.

zultron commented 4 years ago

At this point, we should be able to build the below fork of LCNC against MK-HAL.

https://github.com/zultron/machinekit/tree/zultron/2019-07-03-2.8-mk-hal-build

Right now it only works against MK-HAL packages (unless you fix your $PATH to include the MK-HAL scripts/ directory). Check out that branch, then:

cd src
./autogen.sh
./configure --with-hal=machinekit-hal
make -j$(nproc)
sudo make install # installs motion.so, etc. RT modules
source ../scripts/rip-environment
runtests ../tests
linuxcnc ../configs/sim/axis/axis.ini

cerna commented 4 years ago

(Continuation of the discussion started in #282.) I think that the process has reached a stage where a new repository tracking LinuxCNC should be created. Given that I would like to avoid a misleading name (for example, the name Machinekit-HAL is not that great considering HAL is only one part of the package), I would like to reach some kind of agreement. (Basically to not confuse potential users more than they are already going to be.)

My end goal is for the new repository to just be clear, evident and obvious (what it is).

Few requirements I can think of:

To that end I was thinking about naming it a LinuxCNCModule/LinuxCNCApplication/EMC3/EMC3Application/EMC3Module or Machinekit-LinuxCNC/Machinekit-EMC or LinuxCNCIntegration/EMC3Integration.

(What is the situation on EMC2, is it usable as a name?)

ebo commented 4 years ago

Which machinekit-hal repo and what is the best branch/version to build this against? I finally worked through almost all of the OS/distro issues and do not have hal installed yet...

ebo commented 4 years ago

Before I forget. I have been having one constant issue with rpc.h not found. I temp hacked around it by doing the following:

sed -i "s%AC_CHECK_HEADER(\[rpc\/rpc.h\]%PKG_CHECK_MODULES(\[TIRPC\],\[libtirpc\],\n       [CPPFLAGS=\"\$CPPFLAGS \$TIRPC_CFLAGS -DHAVE_RPC_RPC_H\"; LIBS=\"\$LIBS \$TIRPC_LIBS\";\],\n       [AC_MSG_ERROR(\[libtirpc requested but library not found\])]\n)\nAC_CHECK_HEADER(\[rpc\/rpc.h\]%" configure.ac 

I either need to make a proper patch, or work out how you want this fixed long term -- ie we could also search for TIRPC if the basic test for rpc/rpc.h fails. Would you accept that fail over?
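The fail-over logic itself (try the plain `rpc/rpc.h` first, then ask pkg-config for libtirpc) can be sketched in shell before committing to an autoconf macro. This is only an illustration of the decision order, not a patch for `configure.ac`; the function name `rpc_cppflags` and its include-root argument are hypothetical.

```shell
#!/bin/sh
# Sketch of the proposed fail-over: prefer the header in the default include
# path, fall back to libtirpc. rpc_cppflags and its argument are hypothetical.

rpc_cppflags() {
    include_root="${1:-/usr/include}"
    if [ -f "$include_root/rpc/rpc.h" ]; then
        # Header found in the default location; no extra flags are needed.
        printf '%s\n' ""
        return 0
    fi
    if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists libtirpc; then
        # Header is provided by libtirpc; let pkg-config supply the -I flags.
        pkg-config --cflags libtirpc
        return 0
    fi
    echo "rpc/rpc.h not found and libtirpc is not available" >&2
    return 1
}
```

The same order could then be expressed in `configure.ac` with `AC_CHECK_HEADER` first and `PKG_CHECK_MODULES` as the fallback, so systems that already have the header in the default path are not affected.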

ebo commented 4 years ago

@cerna, yep... lots of confusion -- starting with me... 8-/

ebo commented 4 years ago

I've gotten stuck on building machinekit-hal -- specifically:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 169: invalid continuation byte

I was able to change enough of the python2'isms to py3 to get most of the way through, but have no idea how to fix the hardcoded strings. If I recall correctly they are generated from a conf string, but I have no idea what that string does...
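For what it's worth, byte 0xe9 is "é" in ISO-8859-1 (Latin-1), so one plausible cause (an assumption, not something confirmed in this thread) is a file saved in Latin-1 being read as UTF-8 by Python 3. A rough sketch for checking a file and converting it with `iconv`, using hypothetical helper names:

```shell
#!/bin/sh
# Hypothetical helpers for tracking down a non-UTF-8 file.
# Assumes the offending bytes are ISO-8859-1; that is a guess.

is_valid_utf8() {
    # iconv exits non-zero when the input is not valid UTF-8.
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

latin1_to_utf8() {
    # Write the converted text to stdout; the original file is left untouched.
    iconv -f ISO-8859-1 -t UTF-8 "$1"
}
```

Running `is_valid_utf8` over the files the build reads would at least narrow down which one carries the byte, which is the kind of pointer that makes the problem easier to discuss.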

zultron commented 4 years ago

(Continuation of the discussion started in #282.) I think that the process has reached a stage where a new repository tracking LinuxCNC should be created. Given that I would like to avoid a misleading name (for example, the name Machinekit-HAL is not that great considering HAL is only one part of the package), I would like to reach some kind of agreement. (Basically to not confuse potential users more than they are already going to be.)

I'd vote for one of the following:

I'm not too crazy about "module", since that feels like something small and self-contained, which LCNC-EMC is anything but. I do appreciate that it captures the idea of something that can be popped off and switched out for something else.

(What is the situation on EMC2, is it usable as a name?)

I don't see any problem using "EMC2" or "EMC". I'm not too crazy about "EMC3", since that feels like it's a major upgrade over EMC2, which it isn't, aside from the HAL layer. People shouldn't be led to think the MK project is planning to develop the EMC application in some new direction, unless it actually is.

The repo description should further clarify what the name doesn't, perhaps something like "A CNC application built on the Machinekit HAL framework."

zultron commented 4 years ago

I've gotten stuck on building machinekit-hal -- specifically:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 169: invalid continuation byte

I was able to change enough of the python2'isms to py3 to get most of the way through, but have no idea how to fix the hardcoded strings. If I recall correctly they are generated from a conf string, but I have no idea what that string does...

@ebo Can we talk about this in #114? Also, if you actually want help, please provide enough details, like pointers to the code in question, build logs, pointers to your branch, etc.

cerna commented 4 years ago

"EMCApplication": Even more so!

I think that this is the best. "Enhanced Machine Controller Application" sounds nice and the EMC is still used enough that everybody knows what it is. (Maybe it would be better to have linuxcnc/master branch tracking the LinuxCNC@master or linuxcnc/2.8 tracking the LinuxCNC@2.8 and use the default GitHub branch README.)

EMCApplication is also shorter than LinuxCNCApplication

People shouldn't be led to think the MK project is planning to develop the EMC application in some new direction, unless it actually is.

Agreed.

zultron commented 4 years ago

@ebo

sed -i "s%AC_CHECK_HEADER(\[rpc\/rpc.h\]%PKG_CHECK_MODULES(\[TIRPC\],\[libtirpc\],\n [CPPFLAGS=\"\$CPPFLAGS \$TIRPC_CFLAGS -DHAVE_RPC_RPC_H\"; LIBS=\"\$LIBS \$TIRPC_LIBS\";\],\n [AC_MSG_ERROR(\[libtirpc requested but library not found\])]\n)\nAC_CHECK_HEADER(\[rpc\/rpc.h\]%" configure.ac

What's libtirpc? That file's part of glibc in Debian:

$ dpkg-query -S /usr/include/rpc/rpc.h
libc6-dev:amd64: /usr/include/rpc/rpc.h

ebo commented 4 years ago

On May 12 2020 7:44 PM, John Morris wrote:

@ebo

sed -i "s%AC_CHECK_HEADER(\[rpc\/rpc.h\]%PKG_CHECK_MODULES(\[TIRPC\],\[libtirpc\],\n [CPPFLAGS=\"\$CPPFLAGS \$TIRPC_CFLAGS -DHAVE_RPC_RPC_H\"; LIBS=\"\$LIBS \$TIRPC_LIBS\";\],\n [AC_MSG_ERROR(\[libtirpc requested but library not found\])]\n)\nAC_CHECK_HEADER(\[rpc\/rpc.h\]%" configure.ac

What's libtirpc? That file's part of glibc in Debian:

$ dpkg-query -S /usr/include/rpc/rpc.h
libc6-dev:amd64: /usr/include/rpc/rpc.h

I'm running Gentoo, and rpc is supplied by the libtirpc package

equery belongs /usr/include/tirpc/rpc/rpc.h

   * Searching for /usr/include/tirpc/rpc/rpc.h ...
   net-libs/libtirpc-1.2.5 (/usr/include/tirpc/rpc/rpc.h)

The sed script was a hack to try to get it to compile -- I can change this without having to generate a proper patch file, which takes a bit more work. Writing a proper fail over for autoconf to also look in other places will take a bit of thought, and will be done next. This was a temp workaround.

EBo --

ebo commented 4 years ago

On May 12 2020 6:56 PM, cerna wrote:

"EMCApplication": Even more so!

I think that this is the best. "Enhanced Machine Controller Application" sounds nice and the EMC is still used enough that everybody knows what it is. (Maybe it would be better to have linuxcnc/master branch tracking the LinuxCNC@master or linuxcnc/2.8 tracking the LinuxCNC@2.8 and use the default GitHub branch README.)

EMCApplication is also shorter than LinuxCNCApplication

People shouldn't be led to think the MK project is planning to develop the EMC application in some new direction, unless it actually is.

Agreed.

I was about to suggest EMCA, but remembered the legal trademark issues with EMC/EMC2 and looked up EMCA and found -- European Mosquito Control Association (EMCA)...

zultron commented 4 years ago

I'm not too worried about trademark conflicts. Since this project isn't competing in the same industry as the (former) EMC Corporation or the European Mosquito Control Association, (IANAL but) I don't see a basis for legal action. Furthermore, I'm betting that the past issues were caused by EMC2 getting popular enough that search engines put it at the top of results and made the EMC Corp unhappy; I just don't see that happening here.

ebo commented 4 years ago

On May 13 2020 8:40 PM, John Morris wrote:

I'm not too worried about trademark conflicts. Since this project isn't competing in the same industry as the (former) EMC Corporation or the European Mosquito Control Association, (IANAL but) I don't see a basis for legal action. Furthermore, I'm betting that the past issues were caused by EMC2 getting popular enough that search engines put it at the top of results and made the EMC Corp unhappy; I just don't see that happening here.

Fair enough. I just wanted to bring up the point of checking -- a lot of new members were not around when EMC was forced to change its name...

The real question is do we want to change the name away from machinekit-hal and machinekit-cnc?

cerna commented 4 years ago

Given no contest to EMCApplication (and as enough time passed), I started developing the CI pipeline and quickly realized that it will be more complicated than I thought at first.

The problem is that the LinuxCNC project has no rules about how to contribute new code, and as such everybody with write access pushes directly to the master branch. This has at least two obvious problems: sometimes they have to rewrite history, and not all pushes pass tests. So to bring the changes into a local staging branch, I will first have to check whether the tests passed and whether it is a fast-forward merge. This gives another problem: even though LinuxCNC has a GitHub Actions workflow, they only consider code OK if it passes the LinuxCNC buildbot. The buildbot has a JSON API, which is good, but I will have to study how to get the information I want, which is pretty bad. (What I basically want is to give the buildbot a git SHA and have it respond that the tests passed, that the tests did not pass, or that no tests exist for it, if somebody has previous experience.) (I think I found it: jq the 0000.checkin builds.)

Testing whether a forward-only merge is possible or a hard reset is needed is quite simple (after reading the manual), but given that a reset would require a human touch, some notification service would be nice (I don't think Actions can notify to GitHub Notifications) - I am looking for a web browser push notification service that anybody can subscribe to. (Any ideas?)
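As a rough sketch of that fast-forward test: `git merge-base --is-ancestor A B` succeeds when `A` is an ancestor of `B`, which is exactly the condition for fast-forwarding `A` to `B`. The function name `classify_update` and its three output words are made up for illustration; the ref names in the usage line are the ones proposed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: decide whether a local tracking branch can simply be
# fast-forwarded to the upstream ref, or whether history was rewritten.

classify_update() {
    local_ref="$1"
    upstream_ref="$2"
    if [ "$(git rev-parse "$local_ref")" = "$(git rev-parse "$upstream_ref")" ]; then
        echo up-to-date
    elif git merge-base --is-ancestor "$local_ref" "$upstream_ref"; then
        # Local tip is contained in upstream: a fast-forward is possible.
        echo fast-forward
    else
        # Upstream no longer contains the local tip: a reset (and a human) is needed.
        echo diverged
    fi
}
```

Something like `classify_update linuxcnc/master upstream/master` after a fetch would then tell the workflow whether to fast-forward or to notify somebody.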

This would create a clean local copy of LinuxCNC@master in branch linuxcnc/master.

Then there would be machinekit/master branch which should contain the goodies to support Machinekit-HAL.

Then I was thinking about an orphaned infrastructure branch which would contain this whole ugliness of automated deployment, as I think it will need tuning and the machinekit/master branch should stay as clean as possible. The problem is that a workflow in infrastructure cannot be triggered by a push to any other branch; it can work only on a CRON trigger. (And you can run the CRON only on the default or base branch.)

So connecting by repository_dispatch hook and PATs will be needed. On normal repository I would consider it a no-go, but given that the development of EMCApplication would be minimal to non-existent, it should not cause that much trouble.
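For reference, a `repository_dispatch` event is sent by POSTing to the GitHub REST endpoint `/repos/OWNER/REPO/dispatches` with an `event_type` and an optional `client_payload`. A minimal sketch with `curl` follows; the helper names, the event name in the usage line, the payload field `sha`, and the `GITHUB_PAT` variable are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helpers for triggering a workflow through repository_dispatch.
# GITHUB_PAT is assumed to hold a personal access token with repo scope.

dispatch_payload() {
    # $1: event type, $2: commit SHA passed along to the receiving workflow
    printf '{"event_type":"%s","client_payload":{"sha":"%s"}}\n' "$1" "$2"
}

send_dispatch() {
    # $1: owner/repo, $2: event type, $3: commit SHA
    curl -fsS -X POST \
        -H "Accept: application/vnd.github+json" \
        -H "Authorization: token $GITHUB_PAT" \
        -d "$(dispatch_payload "$2" "$3")" \
        "https://api.github.com/repos/$1/dispatches"
}
```

A call such as `send_dispatch OWNER/REPO upstream-updated "$sha"` would then be matched on the receiving side by a workflow listening for `repository_dispatch` with that event type.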

OK, anybody sees any hole so far (that I cannot)?

Plus, should the EMCApplication repository track only LinuxCNC@master branch or more?

Also, how often should the automatic deployment from linuxcnc/master to machinekit/master happen? And what method should it use before it notifies somebody that human touch is needed? Rebase? Merge? Something else I don't know about?
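One way the merge variant could look, as a sketch only: try a plain merge, and if it conflicts, abort so the branch is left untouched and report that a human is needed. The function name and the two status words are hypothetical; whether merge or rebase is the better method is still the open question above.

```shell
#!/bin/sh
# Hypothetical helper: merge a source branch into a target branch, backing
# out cleanly when the merge conflicts.

try_auto_merge() {
    # $1: target branch (e.g. machinekit/master), $2: source branch (e.g. linuxcnc/master)
    git checkout -q "$1" || return 2
    if git merge -q --no-edit "$2" >/dev/null 2>&1; then
        echo merged
    else
        # Leave the target branch exactly as it was before the attempt.
        git merge --abort
        echo needs-human
        return 1
    fi
}
```

The non-zero return in the conflict case is what a scheduled job could hook a notification onto.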

cerna commented 4 years ago

@ebo,

Fair enough. I just wanted to bring up the point of checking -- a lot of new members were not around when EMC was forced to change its name...

LinuxCNC is still using the EMC name internally and in forums, I don't think it is going away anytime soon. So I hope that users - even new ones - are inoculated with that knowledge.

The real question is do we want to change the name away from machinekit-hal and machinekit-cnc?

It's little bit different. Nobody is killing Machinekit-HAL or Machinekit-CNC. You are going to need the Machinekit-HAL to run the EMC Application.

Current LinuxCNC project is all on one heap. The HAL, the EMC, couple of other service applications all in one. This is to use the EMC from LinuxCNC only and to cut it in more parts which are simpler to maintain.

Also, as I understand it, the EMCApplication is an emergency move to have relevant CNC capabilities. That's the reason why no new development should be put into it, only compatibility patches.

If somebody is interested in CNC development in Machinekit organization, he can pick up the Machinekit-CNC and do the new development there. Problem is, so far nobody is.

ebo commented 4 years ago

I have had so much trouble getting either linuxcnc or machinekit-hal/cnc to work that I have all but abandoned the idea of getting it to work during the pandemic (when I have a little time here and there to focus on it).

I was wondering why there was junk in the trunk which is a violation of the best practices I was taught years ago. The current workflow description below explains why and how this is. I am not sure you will be able to fix that without a complete break.

I do not have enough experience in setting up these infrastructure to know if it will work or not. But I think chanting the mantra NO junk in the trunk, might wake people up at least a little... Because there always appears to be junk in the trunk, I do not think that you should have any automated transfers from linuxcnc/master to machinekit/master. That means that someone is going to have to review the diffs from the previous data and see if any of them are of interest, and probably have to do that by hand as a remerge.

My experience with the linuxcnc community is that so little effort goes into it that you are unlikely going to get any support from their end. I would dearly LOVE to be proven wrong, but in the past there was only about 10 to 20 consistent hours/month total effort from everyone going into things, and the few that worked on something had their pet peeve they worried at. I have literally been screamed at "DO NOT TOUCH IT", and then they explained that the last person to make such a change took several years to get it stable again. Frankly I think you should not fork from linuxcnc, but to whole-scale start over using best practices and pull chunks and modernize as you go. That said, I doubt that there is sufficient interest for something like that to make it viable.

BTW, when I was banging on getting a gentoo machinekit-hal/cnc ebuild working I had to wipe my personal fork of machinekit because github kept telling me that machinekit was a fork of linuxcnc and as such I already had a fork. Which means that I could not work on them independently.
Since I had so little work on my own machinekit repository I blew it away to see if I could get the python3 support fully working. (BTW, I was only able to get one of the branches to build successfully, but then discovered that as far as I can tell no one updated the stuff in the top lib/python directories -- so that when you recompile all the python code everything breaks. As far as I know running py2 compiled code is not guaranteed to work on py3. I could be wrong though...)

Anyway, not sure what all I can suggest/help with past here. Will continue to poke along... ...

EBo --

ebo commented 4 years ago

On May 21 2020 6:24 AM, cerna wrote:

@ebo,

Fair enough. I just wanted to bring up the point of checking -- a lot of new members were not around when EMC was forced to change its name...

LinuxCNC is still using the EMC name internally and in forums, I don't think it is going away anytime soon. So I hope that users - even new ones - are inoculated with that knowledge.

It has been 20 years since the forced name change. It would be good to start using it consistently, but that would take buy-in from the community. It just helps to avoid ever getting EMC's lawyers in a tizzy.

The real question is do we want to change the name away from machinekit-hal and machinekit-cnc?

It's little bit different. Nobody is killing Machinekit-HAL or Machinekit-CNC. You are going to need the Machinekit-HAL to run the EMC Application.

I am not sure where the EMC Application is defined, or the exact relationship of machinekit-hal, et al. I still think we should not use EMC in the app name, but, that is just me. Anyway, I was just trying to get stable builds of LinuxCNC and MachineKit for my gentoo preempt systems, as well as RPi's running Gentoo and Raspbian.

Current LinuxCNC project is all on one heap. The HAL, the EMC, couple of other service applications all in one. This is to use the EMC from LinuxCNC only and to cut it in more parts which are simpler to maintain.

OK.

Also, as I understand it, the EMCApplication is an emergency move to have relevant CNC capabilities. That's the reason why no new development should be put into it, only compatibility patches.

I had not heard that. I was not planning to add new development (other than an OS specific source build spec, which will be handled outside of the LCNC/MK repositories).

If somebody is interested in CNC development in Machinekit organization, he can pick up the Machinekit-CNC and do the new development there. Problem is, so far nobody is.

I doubt that I can make enough time to take that over. I already have too many other commitments, and a couple that really need to be pulled off my plate before starting something new.

EBo --

cerna commented 4 years ago

I do not have enough experience in setting up these infrastructure to know if it will work or not. But I think chanting the mantra NO junk in the trunk, might wake people up at least a little... Because there always appears to be junk in the trunk, I do not think that you should have any automated transfers from linuxcnc/master to machinekit/master. That means that someone is going to have to review the diffs from the previous data and see if any of them are of interest, and probably have to do that by hand as a remerge.

I don't think I completely get the junk in trunk reference. Well, probably not even close. I have known of people driving around with anvil in their boots - because they had a rear-wheel drive.

Of course there needs not to be any automated transfer - and I was thinking only about automated merge. Then this whole construct with first pulling to linuxcnc/master and then to machinekit/master would not be needed. The whole branched structure would be unnecessary too.

The idea was to allow auto-magic to lower the need for developer's time. Because no Machinekit project has abundance of it.

My experience with the linuxcnc community is that so little effort goes into it that you are unlikely going to get any support from their end.

The LinuxCNC community is not going to change anything which is not directly visible to the user base. I think it is a sort of world-view difference between software engineers, machine engineers and machinists. A typical software lifecycle of 7 years is never going to be a thing in the LinuxCNC community.

(...)everyone going into things, and the few that worked on something had their pet peeve they worried at. I have literally been screamed at "DO NOT TOUCH IT", and then they explained that the last person to make such a change took several years to get it stable again.

It's about means. Be it money or interest. Of course in a project where nobody is getting paid, people will care the most about their pet peeves. However, I can guarantee that if you have a pet peeve in any Machinekit repository, I will not stand in your way when you try to solve it. Not touching things cuts both ways - nothing will break in the meantime, but when the technical debt catches up to you (and it will, no way around it), it breaks big time.

Frankly I think you should not fork from linuxcnc, but to whole-scale start over using best practices and pull chunks and modernize as you go. That said, I doubt that there is sufficient interest for something like that to make it viable.

I am a pragmatist in this (and the forking was before my time), but I think this endeavour would start and never end. You would need proper project management, architectural design, a team of programmers. It's not that small a project. The original Enhanced Machine Controller had backing, the OpenCN has backing (well, I think they call it a fork for marketing reasons, not much code is ported) - this is a small community and the only viable way forward is gradual change. Anything else would be a neck-breaking occurrence (in my opinion).

Anyway, not sure what all I can suggest/help with past here. Will continue to poke along...

Please do. :+1:

I am not sure where the EMC Application is defined, or the exact relationship of machinekit-hal, et al. I still think we should not use EMC in the app name, but, that is just me. Anyway, I was just trying to get stable builds of LinuxCNC and MachineKit for my gentoo preempt systems, as well as RPi's running Gentoo and Raspbian.

What I so far called EMCApplication is @zultron's work of making LinuxCNC work with Machinekit-HAL, respectively the parts of LinuxCNC on the CNC side. It is available so far in his branch. It needs to be propagated to a proper Machinekit organization repository with package distribution, so people will take notice.

I had not heard that. I was not planning to add new development (other than an OS specific source build spec, which will be handled outside of the LCNC/MK repositories).

Other distribution source build specification can be part of Machinekit-HAL proper and be part of test suite. I have long-running pet peeve of this being so Debian specific.

I still think we should not use EMC in the app name, but, that is just me.

OK, I was using it because there was no protest. Now there actually is one, so I should try to solve it.

What would you use in a name that would embody the Enhanced Machine Controller, i.e. would tell that it is the CNC part from LinuxCNC?

ebo commented 4 years ago

On May 21 2020 1:26 PM, cerna wrote:

I do not have enough experience in setting up these infrastructure to know if it will work or not. But I think chanting the mantra NO junk in the trunk, might wake people up at least a little... Because there always appears to be junk in the trunk, I do not think that you should have any automated transfers from linuxcnc/master to machinekit/master. That means that someone is going to have to review the diffs from the previous data and see if any of them are of interest, and probably have to do that by hand as a remerge.

I don't think I completely get the junk in trunk reference. Well, probably not even close. I have known of people driving around with anvil in their boots - because they had a rear-wheel drive.

ROFLOL wonderful image ;-) I once saw someone drive to a gas station with only three tires -- the 4th was flat and removed, and some heavy weight was put on the hood to boot to help make sure the fender did not do a digger 8-0 I would not do that myself, but it worked for them in an emergency...

The expression junk in trunk has to do with a particular software engineering management philosophy about what one should expect from a repository's main (branch/trunk). I cannot remember if Trunk is the equivalent of Branch in CVS, Subversion, Mercurial, etc. But basically you expect that the main branch is always compilable with the latest/greatest code version. This would be the up to date dev branch. If it does not compile and breaks, then there is "junk in trunk". It really just comes down to how you are managing the repository. In one of my last serious software testing gigs I was able to convince people to keep the junk out of the trunk, and then the automated test suite would not only work, but fill out 95% of the required tracking docs. This was for a non-mission critical flight system...

Of course there needs not to be any automated transfer - and I was thinking only about automated merge. Then this whole construct with first pulling to linuxcnc/master and then to machinekit/master would not be needed. The whole branched structure would be unnecessary too.

Not sure how to set this up given the history of LCNC and the repo.

The idea was to allow auto-magic to lower the need for developer's time. Because no Machinekit project has an abundance of it.

agreed. Not sure how to make that happen without revamping both, and even then I am not sure I have the experience to design that workflow.
That said, if you work it out I would be interested in learning how you pulled it off.

My experience with the linuxcnc community is that so little effort goes into it that you are unlikely going to get any support from their end.

LinuxCNC community is not going to change anything which is not directly visible to the user base. I think it is a form of world-view difference between software engineers, machine engineers and machinists. A typical software lifecycle of 7 years is never going to be a thing in the LinuxCNC community.

Yes, and the memories of prior attempts are so painful that the users' reactions to major software engineering changes are, shall we say, ahem..., vocal.

(...)everyone going into things, and the few that worked on something had their pet peeve they worried at. I have literally been screamed at "DO NOT TOUCH IT", and then they explained that the last person to make such a change took several years to get it stable again.

It's about means. Be it money or interest. Of course in a project where nobody is getting paid, people will care the most about their pet peeves. However, I can guarantee that if you have a pet peeve in any Machinekit repository, I will not stand in your way when you try to solve it. Not touching things cuts both ways - nothing will break in the meantime, but when the technical debt catches up with you (and it will, no way around it), it breaks big time.

Amen brother! Do you know that I spent probably 100 hours over the last year trying to follow the instructions on machinekit.io to get it installed on a RPi3/4 and my Gentoo box. I literally gave up. I would say that both LCNC and MK are both in the bitrot phase. I do not mean to cast aspersions, but several of the dependencies are no longer supported in modern architectures -- like python2 for instance.

Frankly I think you should not fork from linuxcnc, but start over wholesale using best practices, pulling in chunks and modernizing as you go. That said, I doubt that there is sufficient interest for something like that to make it viable.

I am a pragmatist in this (and the forking was before my time), but I think this endeavour would start and never end. You would need proper project management, architectural design, a team of programmers. It's not that small a project. The original Enhanced Motion Controller had a backing, the OpenCN has a backing (well, I think they call it a fork because of marketing reasons, not much code is ported) - this is a small community and the only viable way forward is gradual change. Anything else would be a neck-breaking occurrence (in my opinion).

Hmmm... I had not heard about OpenCN. I'll look into it as well. BTW, the first time I poked at EMC was in 1999 when I had to sign a document with Fred Proctor at NIST... Yep, the original work was a full on gov grant effort.

Anyway, not sure what all I can suggest/help with past here. Will continue to poke along...

Please do. :+1:

sure, but I am in the same boat -- I have 4+ machines I want/need to rebuild controllers for (and would like to try MK, but already have a long in the tooth LCNC install). I cannot dedicate my time to this either, so I slice off a few minutes here and there when I either need something or am unwinding of an evening. No, I am in the same boat.

I am not sure where the EMC Application is defined, or the exact relationship of machinekit-hal, et al. I still think we should not use EMC in the app name, but, that is just me. Anyway, I was just trying to get stable builds of LinuxCNC and MachineKit for my gentoo preempt systems, as well as RPi's running Gentoo and Raspbian.

What I have so far called EMCApplication is @zultron's work on making LinuxCNC (more precisely, the CNC-side parts of LinuxCNC) run on Machinekit-HAL. So far it is available only in his branch. It needs to be propagated to a proper Machinekit organization repository with package distribution, so people will take notice.

good to know. I may look at the 2019-07-03-2.8-mk-hal-build. Will ask more questions about the details of the branches so that I know what to start with or try.

I had not heard that. I was not planning to add new development (other than an OS specific source build spec, which will be handled outside of the LCNC/MK repositories).

Source build specifications for other distributions can be part of Machinekit-HAL proper and part of the test suite. I have a long-running pet peeve about this being so Debian-specific.

Gentoo uses something called 'portage' which provides a very fine grained dependency control. Not only can I specify min/max versions of dependencies, but also what is incompatible, automatically add a patch to the system without having to do it by hand, build off of git repositories (and you can name branches and specific commit tags)...

I still think we should not use EMC in the app name, but, that is just me.

OK, I was using it because there was no protest. Now there actually is one, so I should try to solve it.

This is an opinion. It does not matter in the grand scheme, but if EMC corp ever comes back then it will matter a lot. This is just honoring the agreement that LinuxCNC made with EMC about trademark conflict.

What would you use in a name, that would embody the Enhanced Motion Controller, i.e. would be telling that it is a CNC part from LinuxCNC?

I think the original issue with EMC or EMC2 was that it is trademarked. As far as I know "Enhanced Motion Controller" is still free game, but not EMC or EMC2 (I forget why the 2 mattered to them but it did). How about "MK-CNC" or just "CNC"?

EBo --

cerna commented 4 years ago

So, I have been playing around with it last couple of days. I think I have got the update-local-tracking-branch-to-remote-one script working OKish - well, the biggest hurdle is getting information about successful build of given SHA from LinuxCNC's buildbot. The current implementation is too precarious for my tastes, it is working around errors specific to this one buildbot that I discovered (but I am sure that I did not catch them all). To be frank, I have a feeling that the whole idea of testing LinuxCNC's code is too precarious.
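The core of such a check (pick the newest upstream commit that the buildbot reports as green) can be sketched roughly as below. This is only an illustration, not the actual workflow: it assumes the buildbot results have already been fetched into plain records, and the record keys (`sha`, `builder`, `result`), the `"success"` value and the function name `newest_green_sha` are all hypothetical.

```python
def newest_green_sha(commits, builds):
    """Return the newest commit whose recorded builds all succeeded.

    commits: list of commit SHAs, newest first.
    builds:  list of dicts with the (hypothetical) keys
             'sha', 'builder' and 'result'.
    Returns None when no commit qualifies.
    """
    # Group the build results by commit.
    results_by_sha = {}
    for build in builds:
        results_by_sha.setdefault(build["sha"], []).append(build["result"])

    for sha in commits:
        results = results_by_sha.get(sha, [])
        # A commit with no recorded build is NOT treated as green.
        if results and all(result == "success" for result in results):
            return sha
    return None


example_builds = [
    {"sha": "c3", "builder": "amd64", "result": "failure"},
    {"sha": "c2", "builder": "amd64", "result": "success"},
    {"sha": "c2", "builder": "armhf", "result": "success"},
    {"sha": "c1", "builder": "amd64", "result": "success"},
]
print(newest_green_sha(["c3", "c2", "c1"], example_builds))  # c2
```

Requiring at least one recorded build per commit keeps a commit the buildbot never saw from being mistaken for a successful one.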

I have also tried to first rebase and then merge the zultron/2019-07-03-2.8-mk-hal-build onto and to the current LinuxCNC@master HEAD. Haven't finished it yet, but yeah, merging is definitely easier. On the other hand, rebasing made me discover the wonderful powers of git rerere.

However, I will probably rebase the branch on current LinuxCNC@master's HEAD and then will periodically merge in the LinuxCNC upstream changes. @zultron is right, it will create preserved history, which rebasing cannot. It will make the changes fall down in the commit list, but that's not that bad. But maybe it's time to try merging some commits to the LinuxCNC proper, like the b9d38f888b79736ea999a44954a548ab8501f406, 08ee74cb2c5da8fd6d0495f3a95c20d7a0c03b95, 9f77ae98ea58291122c163444eff35b5912eff27 or 372a171e912b0c7cc05644d2bd1cf9f74c1fff3c and maybe squashing the df72791c908bd25e98c94037e9b058b2835323c7 and cedd661bb3cde1980cd39760bed894e97978b17e.(?)


(...)The expression junk in trunk has to do with a particular software engineering management philosophy about what one should expect from the a repositories' main (branch/trunk)(...)

Ah. Thank you for explaining. The machinekit/Machinekit-HAL@master should be compilable at any given time and the tests should run green (well, there is one or two failing non-deterministically). Given the current situation, this is of course not that useful information (or better yet, it doesn't have to be that useful) from the point of view of actual users. You can use the Machinekit-HAL and it will work fine - or as fine as Machinekit ever worked. But you cannot use the Machinekit-CNC without (too much) work and the (what I call) EMCApplication is not yet all set up (but is usable).

Amen brother! Do you know that I spent probably 100 hours over the last year trying to follow the instructions on machinekit.io to get it installed on a RPi3/4 and my Gentoo box. I literally gave up. I would say that both LCNC and MK are both in the bitrot phase.

LinuxCNC for sure. Machinekit is escaping crematorium while gas is already flowing. I have never used a source-code based distribution, so have no idea how hard it can be.

Hmmm... I had not heard about OpenCN. I'll look into it as well. BTW, the first time I poked at EMC was in 1999 when I had to sign a document with Fred Proctor at NIST... Yep, the original work was a full on gov grant effort.

OpenCN is using some pretty interesting ideas - like Asymmetrical Multi-Processing and communicating by causing IPI - on one hand, on the other - they are using special hacked version of Xenomai, supporting one EtherCAT piece of hardware - in other words, the use-case is pretty narrow.

Oh, you are an Ancient one. Not many people from that time still interested in the Controller. As far as I know, Proctor was contributing even to LinuxCNC proper, but then disappeared.

sure, but I am in the same boat -- I have 4+ machines I want/need to rebuild controllers for (and would like to try MK, but already have a long in the tooth LCNC install). I cannot dedicate my time to this either, so I slice off a few minutes here and there when I either need something or am unwinding of an evening. No, I am in the same boat.

Few minutes here and there is better than nothing. (In my opinion.)

(...)Gentoo uses something called 'portage' which provides a very fine grained dependency control(...)

Makes me wonder whether it would be possible to integrate it into the CI build, given how long it would take to compile everything, and whether compiling Machinekit-HAL on some Gentoo snapshot in time would even be beneficial.

This is an opinion. It does not matter in the grand scheme, but if EMC corp ever comes back then it will matter a lot. This is just honoring the agreement that LinuxCNC made with EMC about trademark conflict.

I just don't get it how there can even be a claim on it. The EMC2 was abbreviation of Enhanced Motion Controller 2, The EMC Corporation is abbreviation of Egan, Marino & Curly and now it is Dell EMC. There is the energy-mass formula, which is older and few other companies named EMC2. Then there is EMC3, the creative agency.

I was thinking about it for a couple of days (reason why I am responding now), and the truth is, I cannot think of anything. Of anything better, of any reason why it even was trademark conflict, of why the LinuxCNC went for it (well, they didn't have the domain name, and they already had LinuxCNC.org one).

How about "MK-CNC" or just "CNC"?

Problem is, there already is the Machinekit-CNC which is not going away. MK-CNC is too close to it and CNC is quite general (I have a wish to support other controllers in the future - it is a wish which will probably never come true, but it is a wish nonetheless.)

If users are confused with current state of affairs in Machinekit, having Machinekit-CNC and MK-CNC would be even more confusing.

ebo commented 4 years ago

On May 26 2020 4:05 PM, cerna wrote:

So, I have been playing around with it last couple of days. I think I have got the update-local-tracking-branch-to-remote-one script working OKish - well, the biggest hurdle is getting information about successful build of given SHA from LinuxCNC's buildbot. The [current implementation](https://github.com/cerna/EMCApplication2/blob/702e11368f0fab5a1f2849bdfeb7d2f2f23b0891/.github/workflows/fetch-and-reset-upstream-workflow.yaml#L99) is too precarious for my tastes, it is working around errors specific to this one buildbot that I discovered (but I am sure that I did not catch them all). To be frank, I have a feeling that the whole idea of testing LinuxCNC's code is too precarious.

But without testing you are dying as soon as you are born. The only real way I know to keep such a project alive long term is to automate the tests and have decent coverage -- particularly if you are working on multiple architectures. Then the code base can be checked for consistency.

I have also tried to first rebase and then merge the zultron/2019-07-03-2.8-mk-hal-build onto and to the current LinuxCNC@master HEAD. Haven't finished it yet, but yeah, merging is definitely easier. On the other hand, rebasing made me discover the wonderful powers of git rerere.

oooo... never heard of git rerere. Interesting. I really need to get my git fu on.

However, I will probably rebase the branch on current LinuxCNC@master's HEAD and then will periodically merge in the LinuxCNC upstream changes. @zultron is right, it will create preserved history, which rebasing cannot. It will make the changes fall down in the commit list, but that's not that bad. But maybe it's time to try merging some commits to the LinuxCNC proper, like the b9d38f888b79736ea999a44954a548ab8501f406, 08ee74cb2c5da8fd6d0495f3a95c20d7a0c03b95, 9f77ae98ea58291122c163444eff35b5912eff27 or 372a171e912b0c7cc05644d2bd1cf9f74c1fff3c and maybe squashing the df72791c908bd25e98c94037e9b058b2835323c7 and cedd661bb3cde1980cd39760bed894e97978b17e.(?)


(...)The expression junk in trunk has to do with a particular software engineering management philosophy about what one should expect from the a repositories' main (branch/trunk)(...)

Ah. Thank you for explaining. The machinekit/Machinekit-HAL@master should be compilable at any given time and the tests should run green (well, there is one or two failing non-deterministically). Given the current situation, this is of course not that useful information (or better yet, it doesn't have to be that useful) from the point of view of actual users. You can use the Machinekit-HAL and it will work fine - or as fine as Machinekit ever worked. But you cannot use the Machinekit-CNC without (too much) work and the (what I call) EMCApplication is not yet all set up (but is usable).

I just gave machinekit/Machinekit-HAL@master another brief poke tonight, and it is broken out of the box. Attached patch fixes two issues with configure.ac. The first is a minor error in literal version specification which is a hard break. The second is a conf test for python site location that uses python2 syntax and breaks. That fails gracefully, but causes issues down the line. The next point where it breaks, irrecoverably, is in various python2-isms that break the build...

Now all that said, maybe your comment was regarding a clean python2 build. 5 months ago Gentoo started removing python2 support from the build system, and it is getting to be a pain to set up new builds with python2. In fact most old code is now being fully deprecated. Anyway, I managed to do it half-automated, and half by hand. Was able to build using python2 out of master with a few worrisome warnings, but yea, it marched along...

Amen brother! Do you know that I spent probably 100 hours over the last year trying to follow the instructions on machinekit.io to get it installed on a RPi3/4 and my Gentoo box. I literally gave up. I would say that both LCNC and MK are both in the bitrot phase.

LinuxCNC for sure. Machinekit is escaping crematorium while gas is already flowing. I have never used a source-code based distribution, so have no idea how hard it can be.

Source code distribution via portage is normally easy. It automates the autogen.sh and configure steps so that it is consistent every time, and explicitly specifies, provided you tell it so, which versions of the libraries it works or does not work with. So, as long as we can get it to compile at all with up to date tool chains, I should be able to automate it.

Hmmm... I had not heard about OpenCN. I'll look into it as well. BTW, the first time I poked at EMC was in 1999 when I had to sign a document with Fred Proctor at NIST... Yep, the original work was a full on gov grant effort.

OpenCN is using some pretty interesting ideas - like Asymmetrical Multi-Processing and communicating by causing IPI - on one hand, on the other - they are using special hacked version of Xenomai, supporting one EtherCAT piece of hardware - in other words, the use-case is pretty narrow.

I just took a deeper look at OpenCN. I understand why they use tools like matlab for code generation, but that is a death blow for long term maintainability. I would have to look at what they are using it for and if it can somehow be optimized in another decently fast language, or not. I also worry about how much baggage it brought along as a LinuxCNC fork -- for all the reasons discussed elsewhere.

Oh, you are an Ancient one. Not many people from that time still interested in the Controller. As far as I know, Proctor was contributing even to LinuxCNC proper, but then disappeared.

I hope Proctor is OK and not fallen into the great chip-bucket. While I am still interested, I have all but completely given up on it.

sure, but I am in the same boat -- I have 4+ machines I want/need to rebuild controllers for (and would like to try MK, but already have a long in the tooth LCNC install). I cannot dedicate my time to this either, so I slice off a few minutes here and there when I either need something or am unwinding of an evening. No, I am in the same boat.

Few minutes here and there is better than nothing. (In my opinion.)

I keep telling myself, and keep getting nowhere useful. Hurmph.

(...)Gentoo uses something called 'portage' which provides a very fine grained dependency control(...)

Makes me wonder whether it would be possible to integrate it into the CI build, given how long it would take to compile everything, and whether compiling Machinekit-HAL on some Gentoo snapshot in time would even be beneficial.

yes. You can configure the build bot to build against known configurations (either on that hardware, or against an emulator). If the test suite is automated, then they can be run nightly, or whenever a PR is made -- if all tests run, and the code is reasonably tested, then there should be no junk in the trunk ;-)

I really do not see any reasonable way forward without a reboot. I have other 'missions' in life other than LCNC/MK. I cannot realistically spend much more time on this at all. The last time I gave a serious poke at something like this I started by setting up an automated test suite, then I picked some small necessary functional unit (say the motion physics) and got the tests working for that. Now move on to all the different languages the old system used, and one by one get some small necessary things working in the test harness. Once you get that, you keep building until something useful is there, and then keep going.

No, the more I think about this the more I need to pull the plug again and work on other things, and that is sad because I really love the idea of LinuxCNC and MachineKit -- it is just not stable enough to be long term viable for me. Sigh....

This is an opinion. It does not matter in the grand scheme, but if EMC corp ever comes back then it will matter a lot. This is just honoring the agreement that LinuxCNC made with EMC about trademark conflict.

I just don't get it how there can even be a claim on it. The EMC2 was abbreviation of Enhanced Motion Controller 2, The EMC Corporation is abbreviation of Egan, Marino & Curly and now it is Dell EMC. There is the energy-mass formula, which is older and few other companies named EMC2. Then there is EMC3, the creative agency.

lawyers and money... it all comes down to that. It does not have to make sense.

I was thinking about it for a couple of days (reason why I am responding now), and the truth is, I cannot think of anything. Of anything better, of any reason why it even was trademark conflict, of why the LinuxCNC went for it (well, they didn't have the domain name, and they already had LinuxCNC.org one).

again, lawyers and money. Someone got a trademark on EMC, and then their lawyers said you cannot use it. They had the law on their side with that one. Yes, they could have offered to disambiguate -- a little link saying "this is the Enhanced Machine Control project. If you are looking for the trademarked company, please go here..." Maybe they would have gone for it, but there was no fighting it without a lot of money.

How about "MK-CNC" or just "CNC"?

Problem is, there already is the Machinekit-CNC which is not going away. MK-CNC is too close to it and CNC is quite general (I have a wish to support other controllers in the future - it is a wish which will probably never come true, but it is a wish nonetheless.)

If users are confused with current state of affairs in Machinekit, having Machinekit-CNC and MK-CNC would be even more confusing.

I do not have a decent solution.

Overall, I need to get back to my other projects. I will continue to monitor this. Once MK or LCNC can dependably be built on Py3 I will look at it again.

EBo --

cerna commented 4 years ago

OK, after few failed attempts I am going to ask: What is the best strategy to rebase the EMCApplication on current LinuxCNC@master in cases, when things which are important for running EMCA atop the Machinekit-HAL are dependent on the HAL implementation in LinuxCNC? I am asking specifically about src/rtapi_string.h (available in /usr/include/machinekit/) - but the EMCA needs the rtapi_strxcpy and such which are implemented only in LinuxCNC's rtapi?

(As this is going to pop up often, I guess the answer of translate it to Machinekit-HAL is not that great.)

zultron commented 4 years ago

I am asking specifically about src/rtapi_string.h (available in /usr/include/machinekit/) - but the EMCA needs the rtapi_strxcpy and such which are implemented only in LinuxCNC's rtapi?

Without having looked at this specific example, I'd think the options would be to port rtapi_strxcpy to MK, or patch EMCApp to use the MK functions in rtapi_string.h, or write a rtapi_strxcpy compatibility wrapper around the closest MK function.

What have you tried that failed?

You're right, this will come up often. I'm guessing we'll have to deal with it on a case-by-case basis, as we've already done with the EMCApp port, unless and until we establish a cooperative relationship with the EMCApp upstream.

cerna commented 4 years ago

What have you tried that failed?

I wanted to splice the headers for a given target and satisfy the lookup for some symbols from one file and for other from another. Generally a black magic and not something what I can do with my poorish understanding of Make. And the more I think about it, the more I can see it was moronic.

Without having looked at this specific example, I'd think the options would be to port rtapi_strxcpy to MK, or patch EMCApp to use the MK functions in rtapi_string.h, or write a rtapi_strxcpy compatibility wrapper around the closest MK function.

These changes are specific to compiler warning "suppression" in connection to print-like functions and buffer overflow. I don't have a problem with it, I just don't know if the solution in LinuxCNC is a good one.

I guess what should be decided is how the EMCApp will go about creating the portability shims in the long run, so that it is somehow standardized and accessible for potential future maintainers. (Given that the original idea of only a minimal set of commits will not be possible, or better said, the set will grow in number over time.)

Should some documentation log be kept?

cerna commented 4 years ago

I have spent some time on this again and got it into a - let's say - functioning state. (I haven't yet got to packaging it for Debian, as I had more interesting things to do, but I am going to.)

With dropping the commits which were included in LinuxCNC@master and then rebasing, plus adding the linking against the $(L_ULAPI) (because of the symbols defined [and needed] in rtapi_string.h) and $(LIB_HAL_SO) where applicable, I reached:

Runtest: 211 tests run, 203 successful, 8 failed + 0 expected
Failed: 
    ../tests/build/ui
    ../tests/hal-link-unlink
    ../tests/halmodule.0
    ../tests/interp/compile
    ../tests/interp/plug
    ../tests/lowlevel/mutex
    ../tests/pyhal
    ../tests/tclsh-extensions
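
A summary in this format is easy to pull apart in a script, for example to track the failing tests from one rebase to the next. The sketch below is only an illustration: it presumes the exact summary wording shown above, and `parse_runtest` is a made-up helper name, not something shipped with LinuxCNC's runtest.

```python
import re

# Matches the summary line exactly as it appears in the output above.
SUMMARY_RE = re.compile(
    r"Runtest:\s+(\d+) tests run,\s+(\d+) successful,"
    r"\s+(\d+) failed \+ (\d+) expected"
)


def parse_runtest(output):
    """Extract the counters and the list of failed tests from runtest output."""
    match = SUMMARY_RE.search(output)
    if match is None:
        raise ValueError("no Runtest summary line found")
    total, successful, failed, expected = (int(g) for g in match.groups())

    failed_tests = []
    in_failed_list = False
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Failed:"):
            in_failed_list = True
            continue
        if in_failed_list:
            if not stripped:  # a blank line ends the list
                break
            failed_tests.append(stripped)

    return {
        "total": total,
        "successful": successful,
        "failed": failed,
        "expected": expected,
        "failed_tests": failed_tests,
    }


sample = """Runtest: 211 tests run, 203 successful, 8 failed + 0 expected
Failed:
    ../tests/build/ui
    ../tests/hal-link-unlink
"""
report = parse_runtest(sample)
print(report["failed"], report["failed_tests"])
```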

I don't completely understand what the problem with the ui test is yet. I have a feeling that some of these failures are caused by the Run tests against system install pull request in LinuxCNC, or rather by how these tests presume library installations which are not there when I am running EMCApplication in a half-installed, half-RIP state. For example, from the ui test I can get:

+ set -x
+ g++ -I nml-position-logger.cc -L -lnml -llinuxcnc -o /dev/null
/usr/bin/ld: cannot find -llinuxcnc
collect2: error: ld returned 1 exit status

After copying all libraries (shared and archives) from $EMCA/lib to /usr/lib, it suddenly can see it (but there is still some problem):

+ set -x
+ g++ -I nml-position-logger.cc -L -lnml -llinuxcnc -o /dev/null
/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/Scrt1.o: in function `_start':
(.text+0x20): undefined reference to `main'
collect2: error: ld returned 1 exit status

The hal-link-unlink test has a problem with Python:

+ start
./test.sh: line 3: start: command not found
+ linuxcnc-python hallink.py
Traceback (most recent call last):
  File "hallink.py", line 10, in <module>
    h.newpin("in", hal.HAL_FLOAT, hal.HAL_IN)
  File "/mk/lib/python/hal.py", line 67, in newpin
    def newpin(self, *a, **kw): return Pin(_hal.component.newpin(self, *a, **kw))
MemoryError: hal_malloc failed
+ stop
./test.sh: line 6: stop: command not found
+ exit 127

But this (to me at least) looks like a HAL related issue, in other words it is in domain of Machinekit-HAL and outside of scope for EMCApplication.

The same for halmodule.0 test:

+ start
./test.sh: line 2: start: command not found
+ ./test.py
Traceback (most recent call last):
  File "./test.py", line 7, in <module>
    ps = h.newpin("s", hal.HAL_S32, hal.HAL_OUT)
  File "/mk/lib/python/hal.py", line 67, in newpin
    def newpin(self, *a, **kw): return Pin(_hal.component.newpin(self, *a, **kw))
MemoryError: hal_malloc failed
+ stop
./test.sh: line 4: stop: command not found

This too looks as out of scope for EMCApplication.

The interp/compile:

+ set -xe
+ test -n machinekit-hal
++ pkg-config --libs machinekit-hal
+ HAL_LIBS=
++ pkg-config --variable lib_rtapi_math machinekit-hal
+ L_RTAPI_MATH=-lrtapi_math
++ pkg-config --variable lib_hal machinekit-hal
+ L_HAL=-lhal
+ g++ -o use-rs274 use-rs274.cc -Wall -Wextra -Wno-return-type -Wno-unused-parameter -I -I -L -Wl,-rpath, -lrs274 -lrtapi_math -lhal
use-rs274.cc:17:10: fatal error: Python.h: No such file or directory
 #include <Python.h> // must be first header
          ^~~~~~~~~~
compilation terminated.

Ditto.

The interp/plug test:

+ rs274 -p /libcanterp.so -g canon
+ awk '{$1=""; print}'
interp_from_shlib(/libcanterp.so)
emcTaskInit: could not open interpreter '/libcanterp.so': /libcanterp.so: cannot open shared object file: No such file or directory
emcTaskInit: could not open interpreter '/libcanterp.so': /mk/lib/emc2//libcanterp.so: cannot open shared object file: No such file or directory
executing
Bad character '1' used
  1 N..... USE_LENGTH_UNITS(CANON_UNITS_INCHES)
+ exit 1

This frankly looks like a wrong substitution of where the library is located. (Again, when I patch the test to point directly to /usr/lib/libcanterp.so, the test runs green.)

The lowlevel/mutex fails:

+ test -n machinekit-hal
++ pkg-config --cflags machinekit-hal
+ CFLAGS=-I/usr/include/machinekit
+ gcc -O -I ./test.c -o test -DULAPI -std=gnu99 -I/usr/include/machinekit -pthread
gcc: fatal error: no input files
compilation terminated.
+ exit 1

But this is caused by the fact, that the $HEADERS variable is not set when running in this state, the moment one sets it to $EMCA/include, the test runs green.
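
The `no input files` error fits how GCC parses its command line: a bare `-I` or `-L` takes the following token as its directory, so with an empty variable the `-I ./test.c` in the log swallows the source file itself. A minimal sketch for spotting this pattern in a logged command follows; the helper name `find_swallowed_args` and the list of source suffixes are illustrative assumptions.

```python
import shlex

# Options that accept their directory as a separate following token.
DIR_OPTIONS = ("-I", "-L")
# Suffixes treated as "this is a source file" (an assumption for this sketch).
SOURCE_SUFFIXES = (".c", ".cc", ".cpp")


def find_swallowed_args(command):
    """Return (option, swallowed_token) pairs for suspicious bare -I / -L."""
    tokens = shlex.split(command)
    suspects = []
    i = 0
    while i < len(tokens) - 1:
        if tokens[i] in DIR_OPTIONS:
            following = tokens[i + 1]
            # A directory argument that looks like another option or like a
            # source file suggests that a variable expanded to nothing.
            if following.startswith("-") or following.endswith(SOURCE_SUFFIXES):
                suspects.append((tokens[i], following))
            i += 2  # the compiler consumes the following token as well
        else:
            i += 1
    return suspects


mutex_cmd = ("gcc -O -I ./test.c -o test -DULAPI -std=gnu99 "
             "-I/usr/include/machinekit -pthread")
print(find_swallowed_args(mutex_cmd))  # [('-I', './test.c')]
```

Run on the `g++ -I nml-position-logger.cc -L -lnml -llinuxcnc -o /dev/null` line from the ui test, the same check flags both the `-I` and the `-L`, which would also explain the `undefined reference to main` there: no source file is left to compile.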

The pyhal test:

Traceback (most recent call last):
  File "./test", line 2, in <module>
    from pyhal import *
  File "/mk/lib/python/pyhal.py", line 6, in <module>
    lib = CDLL('liblinuxcnchal.so')
  File "/usr/lib/python2.7/ctypes/__init__.py", line 366, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: liblinuxcnchal.so: cannot open shared object file: No such file or directory

The tests/tclsh-extensions test fails on inability to start Machinekit-HAL:

+ start
./test.sh: line 5: start: command not found
+ /usr/bin/tclsh8.6 test.tcl
halcmd: cant connect to rtapi_app: -1 (uri= uuid=76f1c993-f8b4-43f7-9dae-ffe09d161a0f): rtapi_rpc(): reply timeout

halcmd: the rtapi:0 RT demon is not running - please investigate /var/log/hal.log
halcmd: the msgd:0 logger demon is not running - please investigate /var/log/hal.log
./test.sh: line 6: 24938 Segmentation fault      (core dumped) ${LINUXCNC_EMCSH/wish/tclsh} test.tcl
+ exitval=139
+ stop
./test.sh: line 7: stop: command not found
+ exit 139

However, this is caused by wrong substitution of $REALTIME variable, when I patch the test to use the Machinekit-HAL installed one, the test runs green.
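
Since several of these failures come down to a variable that expands to nothing, a pre-flight check before the test run could catch them early. A rough sketch, assuming the variables reach the tests through the environment: the helper name `unset_variables` is hypothetical, and the names to check (here `HEADERS` and `REALTIME`, the two mentioned above) would have to match whatever the test scripts really use.

```python
import os


def unset_variables(required, env=None):
    """Return the names from `required` that are missing or empty in `env`."""
    if env is None:
        env = os.environ
    return [name for name in required if not env.get(name, "").strip()]


# Example values only; the REALTIME path is made up for this illustration.
example_env = {"REALTIME": "/usr/bin/realtime", "HEADERS": ""}
print(unset_variables(["HEADERS", "REALTIME"], example_env))  # ['HEADERS']
```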


I could - of course - patch these tests to run green with EXTRENAL_HAL and disable ones which are intended more for INTERNAL_HAL, but I am thinking if this is worth it given that:

But looking at the whole buildsystem, I have a feeling that it is completely unstable and that a crash is waiting to happen sometime in the future. Which is not very comforting.

ebo commented 4 years ago

what branch are you working on, or is this in the main branch?

On Jul 19 2020 12:48 PM, cerna wrote:

I have spent some time on this again and got it into a -lets say - functioning state. (I haven't yet got to packaging it for Debian, as I had more interesting things to do, but I am going to.)

With dropping the commits which were included in LinuxCNC@master and then rebasing, plus adding the linking against the $(L_ULAPI) (because of the symbols defined [and needed] in rtapi_string.h) and $(LIB_HAL_SO) where applicable, I reached:

Runtest: 211 tests run, 203 successful, 8 failed + 0 expected
Failed:
  ../tests/build/ui
  ../tests/hal-link-unlink
  ../tests/halmodule.0
  ../tests/interp/compile
  ../tests/interp/plug
  ../tests/lowlevel/mutex
  ../tests/pyhal
  ../tests/tclsh-extensions

I don't completely understand what is the problem with the ui test yet. I have a feeling that some of these failures are caused by the Run tests against system install pull request in LinuxCNC, respective how these tests presume library installations which are not there when I am running EMCApplication in half-installed, half-RIP state. For example from the ui I can get:

+ set -x
+ g++ -I nml-position-logger.cc -L -lnml -llinuxcnc -o /dev/null
/usr/bin/ld: cannot find -llinuxcnc
collect2: error: ld returned 1 exit status

After copying all libraries (shared and archives) from $EMCA/lib to /usr/lib, it suddenly can see it (but there is still some problem):

+ set -x
+ g++ -I nml-position-logger.cc -L -lnml -llinuxcnc -o /dev/null
/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/Scrt1.o: in function `_start':
(.text+0x20): undefined reference to `main'
collect2: error: ld returned 1 exit status

The hal-link-unlink test has a problem with Python:

+ start
./test.sh: line 3: start: command not found
+ linuxcnc-python hallink.py
Traceback (most recent call last):
  File "hallink.py", line 10, in <module>
    h.newpin("in", hal.HAL_FLOAT, hal.HAL_IN)
  File "/mk/lib/python/hal.py", line 67, in newpin
    def newpin(self, *a, **kw): return Pin(_hal.component.newpin(self, *a, **kw))
MemoryError: hal_malloc failed
+ stop
./test.sh: line 6: stop: command not found
+ exit 127

But this (to me at least) looks like a HAL-related issue; in other words, it is in the domain of Machinekit-HAL and out of scope for EMCApplication.

The same for halmodule.0 test:

+ start
./test.sh: line 2: start: command not found
+ ./test.py
Traceback (most recent call last):
  File "./test.py", line 7, in <module>
    ps = h.newpin("s", hal.HAL_S32, hal.HAL_OUT)
  File "/mk/lib/python/hal.py", line 67, in newpin
    def newpin(self, *a, **kw): return Pin(_hal.component.newpin(self, *a, **kw))
MemoryError: hal_malloc failed
+ stop
./test.sh: line 4: stop: command not found

This too looks out of scope for EMCApplication.

The interp/compile:

+ set -xe
+ test -n machinekit-hal
++ pkg-config --libs machinekit-hal
+ HAL_LIBS=
++ pkg-config --variable lib_rtapi_math machinekit-hal
+ L_RTAPI_MATH=-lrtapi_math
++ pkg-config --variable lib_hal machinekit-hal
+ L_HAL=-lhal
+ g++ -o use-rs274 use-rs274.cc -Wall -Wextra -Wno-return-type -Wno-unused-parameter -I -I -L -Wl,-rpath, -lrs274 -lrtapi_math -lhal
use-rs274.cc:17:10: fatal error: Python.h: No such file or directory
 #include <Python.h> // must be first header
          ^~~~~~~~~~
compilation terminated.

Ditto.

The interp/plug test:

+ rs274 -p /libcanterp.so -g canon
+ awk '{$1=""; print}'
interp_from_shlib(/libcanterp.so)
emcTaskInit: could not open interpreter '/libcanterp.so': /libcanterp.so: cannot open shared object file: No such file or directory
emcTaskInit: could not open interpreter '/libcanterp.so': /mk/lib/emc2//libcanterp.so: cannot open shared object file: No such file or directory
executing
Bad character '1' used
  1 N..... USE_LENGTH_UNITS(CANON_UNITS_INCHES)
+ exit 1

This frankly looks like a wrong substitution of where the library is located. (Again, when I patch the test to point directly to /usr/lib/libcanterp.so, the test runs green.)

The lowlevel/mutex fails:

+ test -n machinekit-hal
++ pkg-config --cflags machinekit-hal
+ CFLAGS=-I/usr/include/machinekit
+ gcc -O -I ./test.c -o test -DULAPI -std=gnu99 -I/usr/include/machinekit -pthread
gcc: fatal error: no input files
compilation terminated.
+ exit 1

But this is caused by the fact that the $HEADERS variable is not set when running in this state. The moment one sets it to $EMCA/include, the test runs green.

The pyhal test:

Traceback (most recent call last):
  File "./test", line 2, in <module>
    from pyhal import *
  File "/mk/lib/python/pyhal.py", line 6, in <module>
    lib = CDLL('liblinuxcnchal.so')
  File "/usr/lib/python2.7/ctypes/__init__.py", line 366, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: liblinuxcnchal.so: cannot open shared object file: No such file or directory

The tests/tclsh-extensions test fails on inability to start Machinekit-HAL:

+ start
./test.sh: line 5: start: command not found
+ /usr/bin/tclsh8.6 test.tcl
halcmd: cant connect to rtapi_app: -1 (uri= uuid=76f1c993-f8b4-43f7-9dae-ffe09d161a0f): rtapi_rpc(): reply timeout

halcmd: the rtapi:0 RT demon is not running - please investigate /var/log/hal.log
halcmd: the msgd:0 logger demon is not running - please investigate /var/log/hal.log
./test.sh: line 6: 24938 Segmentation fault      (core dumped) ${LINUXCNC_EMCSH/wish/tclsh} test.tcl
+ exitval=139
+ stop
./test.sh: line 7: stop: command not found
+ exit 139

However, this is caused by a wrong substitution of the $REALTIME variable: when I patch the test to use the Machinekit-HAL installed one, the test runs green.


I could - of course - patch these tests to run green with EXTERNAL_HAL and disable the ones which are intended more for INTERNAL_HAL, but I wonder whether this is worth it given that:

  • I am going to merge only good LinuxCNC@master commits from linuxcnc/master branch which were already tested by LinuxCNC and deemed OK (LinuxCNC's branches are surprisingly unstable and stay broken for longish times [but this issue is not for complaining about lack of formal contributing workflow and devOps in LinuxCNC]) and as such I don't consider the compiling tests as that important (at the moment)
  • The AXIS with Enhanced Motion Controller and Kinematics seems to work fine
  • Getting the packages for Debian to actually exist is probably more important for the life of Machinekit

But looking at the whole buildsystem, I have a feeling that it is completely unstable and that a crash is waiting to happen. Which is not very comforting.

cerna commented 4 years ago

@ebo, I currently have it in a local branch (the LinuxCNC/EMCA side), as I am not so sure I didn't make some stupid mistake (and I don't enjoy public humiliation very much). I will check it, repair it and post it to a public GitHub repository. Another problem is the Debian packaging, which is currently missing, so the result is not very usable without Docker or some chroot.

The change which was needed on Machinekit-HAL side is in its master.

zultron commented 4 years ago

I have spent some time on this again and got it into a - let's say - functioning state. (I haven't yet got around to packaging it for Debian, as I had more interesting things to do, but I am going to.)

Great progress! I know personally what kind of grueling work this is, and I really appreciate you doing it!

I don't completely understand what is the problem with the ui test yet.

Found the test:

$ (cd tests/; find * -name ui)
build/ui

I have a feeling that some of these failures are caused by the "Run tests against system install" pull request in LinuxCNC, or rather by how these tests presume library installations which are not there when I am running EMCApplication in a half-installed, half-RIP state.

I'm sure I messed things up for Machinekit there, if not for LinuxCNC, too. (I have another outstanding complaint about that PR already.)

For example from the ui I can get:

+ set -x
+ g++ -I nml-position-logger.cc -L -lnml -llinuxcnc -o /dev/null
/usr/bin/ld: cannot find -llinuxcnc
collect2: error: ld returned 1 exit status

I'm moving house and don't have a dev environment readily in front of me.

Looks like you're running by hand, that's why. The runtests script sets some environment variables for tests, which I now see causes their test.sh script to fail when run by hand. In this case, the $HEADERS env var isn't set.

The problem here is how to pull in build-time configuration. For the test suite, I tried to accomplish this by templating runtests in that PR you mentioned above. LinuxCNC doesn't have one, so this is actually harder to fix on the LCNC side. On the MK side, try reading $CFLAGS and $LIBS from pkg-config within the test.sh script.
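A minimal sketch of that pkg-config idea, assuming the `machinekit-hal` pkg-config module seen in the logs above. The helper name `flags_from_pkgconfig` is hypothetical and not part of either code base; it keeps a value that runtests already exported and only queries pkg-config when the variable is empty.

```shell
#!/bin/sh
# Hypothetical helper (not from Machinekit-HAL or LinuxCNC):
# $1 = current value (may be empty), $2 = pkg-config option, $3 = module name.
flags_from_pkgconfig() {
    if [ -n "$1" ]; then
        # runtests (or the user) already provided a value: keep it
        printf '%s\n' "$1"
    elif command -v pkg-config >/dev/null 2>&1 \
            && pkg-config --exists "$3" 2>/dev/null; then
        # otherwise ask pkg-config for the installed module
        pkg-config "$2" "$3"
    else
        # nothing known: fall back to an empty value
        printf '\n'
    fi
}

CFLAGS=$(flags_from_pkgconfig "${CFLAGS:-}" --cflags machinekit-hal)
LIBS=$(flags_from_pkgconfig "${LIBS:-}" --libs machinekit-hal)
echo "CFLAGS=$CFLAGS"
echo "LIBS=$LIBS"
```

With something like this at the top of a test.sh, the script behaves the same under runtests and when run by hand, as long as the package is installed.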

After copying all libraries (shared and archives) from $EMCA/lib to /usr/lib, it suddenly can see it (but there is still some problem):

+ set -x
+ g++ -I nml-position-logger.cc -L -lnml -llinuxcnc -o /dev/null
/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/8/../../../x86_64-linux-gnu/Scrt1.o: in function `_start':
(.text+0x20): undefined reference to `main'
collect2: error: ld returned 1 exit status

I don't understand this, either.

The hal-link-unlink test has a problem with Python:

+ start
./test.sh: line 3: start: command not found
+ linuxcnc-python hallink.py
Traceback (most recent call last):
  File "hallink.py", line 10, in <module>
    h.newpin("in", hal.HAL_FLOAT, hal.HAL_IN)
  File "/mk/lib/python/hal.py", line 67, in newpin
    def newpin(self, *a, **kw): return Pin(_hal.component.newpin(self, *a, **kw))
MemoryError: hal_malloc failed
+ stop
./test.sh: line 6: stop: command not found
+ exit 127

Same problem, realtime script not setting the $REALTIME env var, causing $REALTIME start to evaluate to start ("command not found"), so HAL isn't running.
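To make that failure mode visible instead of silently running a bare `start`, a test script could check the variable first. This is only a sketch: the function name `require_realtime` is made up for illustration, and the commented usage assumes the tests call `$REALTIME start` and `$REALTIME stop` as the logs suggest.

```shell
#!/bin/sh
# Hypothetical guard: refuse to continue when REALTIME is empty, instead of
# letting "$REALTIME start" collapse into plain "start".
require_realtime() {
    if [ -z "${REALTIME:-}" ]; then
        echo "REALTIME is not set; run via runtests or export it first" >&2
        return 1
    fi
    if ! command -v "$REALTIME" >/dev/null 2>&1; then
        echo "REALTIME=$REALTIME is not an executable command" >&2
        return 1
    fi
}

# Usage inside a test script (commented out so the sketch has no side effects):
# require_realtime || exit 1
# "$REALTIME" start
# ... test body ...
# "$REALTIME" stop
```

The test then fails early with a clear message rather than with "command not found" followed by a HAL error further down.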

This too looks out of scope for EMCApplication.

The interp/compile:

+ set -xe
+ test -n machinekit-hal
++ pkg-config --libs machinekit-hal
+ HAL_LIBS=
++ pkg-config --variable lib_rtapi_math machinekit-hal
+ L_RTAPI_MATH=-lrtapi_math
++ pkg-config --variable lib_hal machinekit-hal
+ L_HAL=-lhal
+ g++ -o use-rs274 use-rs274.cc -Wall -Wextra -Wno-return-type -Wno-unused-parameter -I -I -L -Wl,-rpath, -lrs274 -lrtapi_math -lhal
use-rs274.cc:17:10: fatal error: Python.h: No such file or directory
 #include <Python.h> // must be first header
          ^~~~~~~~~~
compilation terminated.

Not sure how to add -I /usr/include/python2.7 (or other appropriate Python version) to build-time configuration.
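One possible way, sketched under the assumption that a `python-config` style script from the Python development package is installed (the list of candidate script names below is a guess for a Python 2.7 setup, and `python_includes` is a hypothetical helper):

```shell
#!/bin/sh
# Hypothetical helper: print the -I flags for the Python headers, trying
# version-specific config scripts first. Returns 1 when none is available.
python_includes() {
    for cfg in python2.7-config python2-config python-config; do
        if command -v "$cfg" >/dev/null 2>&1; then
            "$cfg" --includes && return 0
        fi
    done
    return 1
}

# Usage: fall back to an empty value so the compile line stays well-formed.
PY_INCLUDES=$(python_includes) || PY_INCLUDES=""
echo "PY_INCLUDES=$PY_INCLUDES"
```

The result could then be appended to the g++ command line of the interp/compile test; whether that belongs in the test itself or in the build-time configuration is a separate question.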

The interp/plug test:

$LIBDIR not set.

This frankly looks like a wrong substitution of where the library is located. (Again, when I patch the test to point directly to /usr/lib/libcanterp.so, the test runs green.)

The lowlevel/mutex fails:


+ test -n machinekit-hal
++ pkg-config --cflags machinekit-hal
+ CFLAGS=-I/usr/include/machinekit
+ gcc -O -I ./test.c -o test -DULAPI -std=gnu99 -I/usr/include/machinekit -pthread

$HEADERS unset.

But this is caused by the fact that the $HEADERS variable is not set when running in this state. The moment one sets it to $EMCA/include, the test runs green.
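The "no input files" message follows directly from the empty variable. A small illustration, assuming the test builds its command line with something like `-I $HEADERS ./test.c` (the helper `count_args` exists only for this demonstration):

```shell
#!/bin/sh
# Count how many arguments a command line expands to.
count_args() { echo "$#"; }

HEADERS=""            # what the test sees when run by hand

# Unquoted and empty, $HEADERS vanishes, so "-I" takes ./test.c as its value
# and gcc is left without an input file.
count_args gcc -O -I $HEADERS ./test.c -o test      # prints 6

# With a value, the include directory and the source file are separate again.
HEADERS="/usr/include/machinekit"
count_args gcc -O -I $HEADERS ./test.c -o test      # prints 7
```

The same mechanism explains the `/libcanterp.so` path in the interp/plug test: an empty $LIBDIR leaves only the trailing part of the path.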

I'm really sorry I don't remember how I implemented these, disjointly, on the MK and LCNC sides. At the time I was thinking this would become a problem, and that I'd step up and fix it when it did.

Now I'm moving my family back to my home country, so I'm not going to be able to fix it right away.

If you don't want to fix it yourself (I wouldn't), maybe you can rebase your development onto an upstream LCNC commit from just before that PR. That will give me some slack to fix the problem I introduced.

cerna commented 4 years ago

Getting into packaging it for Debian, it seems fairly straightforward (after I discovered that EMC2_RTLIB_DIR wasn't for some reason exported for external HAL). I am not so sure about the warnings in the following output and whether it is worth trying to silence them (it seems to work fine as is):

dh_testdir
dh_testroot
dh_installchangelogs
dh_installdocs
dh_installexamples
dh_installman
dh_installmime
dh_link
dh_strip
dh_compress -X.pdf -X.txt -X.hal -X.ini -X.clp -X.var -X.nml -X.tbl -X.xml -Xsample-configs
dh_fixperms -X/linuxcnc_module_helper -X/rtapi_app
dh_python2
I: dh_python2 fs:343: renaming _togl.so to _togl.x86_64-linux-gnu.so
I: dh_python2 fs:343: renaming gcode.so to gcode.x86_64-linux-gnu.so
I: dh_python2 fs:343: renaming lineardeltakins.so to lineardeltakins.x86_64-linux-gnu.so
I: dh_python2 fs:343: renaming linuxcnc.so to linuxcnc.x86_64-linux-gnu.so
I: dh_python2 fs:343: renaming minigl.so to minigl.x86_64-linux-gnu.so
I: dh_python2 fs:343: renaming rotarydeltakins.so to rotarydeltakins.x86_64-linux-gnu.so
dh_makeshlibs
dh_installdeb
cat debian/emcapplication/DEBIAN/shlibs debian/shlibs.pre > debian/shlibs.local
dh_shlibdeps -l debian/emcapplication/usr/lib
dpkg-shlibdeps: warning: symbol rtapi_acos used by debian/emcapplication/usr/lib/libposemath.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol rtapi_sin used by debian/emcapplication/usr/lib/libposemath.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol rtapi_fmax used by debian/emcapplication/usr/lib/libposemath.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol rtapi_asin used by debian/emcapplication/usr/lib/libposemath.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol rtapi_cos used by debian/emcapplication/usr/lib/libposemath.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol rtapi_fabs used by debian/emcapplication/usr/lib/libposemath.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol rtapi_pow used by debian/emcapplication/usr/lib/libposemath.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol rtapi_sqrt used by debian/emcapplication/usr/lib/libposemath.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol rtapi_atan2 used by debian/emcapplication/usr/lib/libposemath.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol _Z17STRAIGHT_TRAVERSEiddddddddd used by debian/emcapplication/usr/lib/librs274.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol _Z8ON_RESETv used by debian/emcapplication/usr/lib/librs274.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol _Z14ORIENT_SPINDLEidi used by debian/emcapplication/usr/lib/librs274.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol _Z10INIT_CANONv used by debian/emcapplication/usr/lib/librs274.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol _Z22DISABLE_SPEED_OVERRIDEi used by debian/emcapplication/usr/lib/librs274.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol _Z22START_SPEED_FEED_SYNCHidb used by debian/emcapplication/usr/lib/librs274.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol _Z23GET_EXTERNAL_POSITION_Bv used by debian/emcapplication/usr/lib/librs274.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol _Z20CLEAR_AUX_OUTPUT_BITi used by debian/emcapplication/usr/lib/librs274.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol _Z31STOP_CUTTER_RADIUS_COMPENSATIONv used by debian/emcapplication/usr/lib/librs274.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol _Z20ENABLE_ADAPTIVE_FEEDv used by debian/emcapplication/usr/lib/librs274.so.0 found in none of the libraries
dpkg-shlibdeps: warning: 138 other similar warnings have been skipped (use -v to see them all)
dpkg-shlibdeps: warning: can't extract name and version from library name 'libtk8.6.so'
dpkg-shlibdeps: warning: symbol _Z7COMMENTPKc used by debian/emcapplication/usr/lib/libcanterp.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol _Z22DISABLE_SPEED_OVERRIDEi used by debian/emcapplication/usr/lib/libcanterp.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol _Z22START_SPEED_FEED_SYNCHidb used by debian/emcapplication/usr/lib/libcanterp.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol _Z13SET_FEED_RATEd used by debian/emcapplication/usr/lib/libcanterp.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol _Z20STOP_SPINDLE_TURNINGi used by debian/emcapplication/usr/lib/libcanterp.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol _Z11PROGRAM_ENDv used by debian/emcapplication/usr/lib/libcanterp.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol _Z12SELECT_PLANE11CANON_PLANE used by debian/emcapplication/usr/lib/libcanterp.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol _Z18SET_FEED_REFERENCE20CANON_FEED_REFERENCE used by debian/emcapplication/usr/lib/libcanterp.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol _Z16USE_LENGTH_UNITS11CANON_UNITS used by debian/emcapplication/usr/lib/libcanterp.so.0 found in none of the libraries
dpkg-shlibdeps: warning: symbol _Z14STRAIGHT_PROBEidddddddddh used by debian/emcapplication/usr/lib/libcanterp.so.0 found in none of the libraries
dpkg-shlibdeps: warning: 26 other similar warnings have been skipped (use -v to see them all)
dpkg-shlibdeps: warning: can't extract name and version from library name 'libtk8.6.so'
dpkg-shlibdeps: warning: can't extract name and version from library name 'libtk8.6.so'
dpkg-shlibdeps: warning: package could avoid a useless dependency if debian/emcapplication/usr/lib/python2.7/dist-packages/_togl.x86_64-linux-gnu.so debian/emcapplication/usr/lib/tcltk/linuxcnc/linuxcnc.so debian/emcapplication/usr/lib/tcltk/linuxcnc/hal.so were not linked against libXss.so.1 (they use none of the library's symbols)
dpkg-shlibdeps: warning: package could avoid a useless dependency if debian/emcapplication/usr/lib/python2.7/dist-packages/_togl.x86_64-linux-gnu.so debian/emcapplication/usr/lib/tcltk/linuxcnc/linuxcnc.so debian/emcapplication/usr/lib/tcltk/linuxcnc/hal.so were not linked against libfreetype.so.6 (they use none of the library's symbols)
dpkg-shlibdeps: warning: package could avoid a useless dependency if debian/emcapplication/usr/lib/python2.7/dist-packages/_togl.x86_64-linux-gnu.so debian/emcapplication/usr/lib/tcltk/linuxcnc/linuxcnc.so debian/emcapplication/usr/lib/tcltk/linuxcnc/hal.so were not linked against libz.so.1 (they use none of the library's symbols)
dpkg-shlibdeps: warning: package could avoid a useless dependency if debian/emcapplication/usr/lib/python2.7/dist-packages/_togl.x86_64-linux-gnu.so debian/emcapplication/usr/lib/tcltk/linuxcnc/linuxcnc.so debian/emcapplication/usr/lib/tcltk/linuxcnc/hal.so were not linked against libfontconfig.so.1 (they use none of the library's symbols)
dpkg-shlibdeps: warning: package could avoid a useless dependency if debian/emcapplication/usr/lib/python2.7/dist-packages/_togl.x86_64-linux-gnu.so debian/emcapplication/usr/lib/tcltk/linuxcnc/linuxcnc.so debian/emcapplication/usr/lib/tcltk/linuxcnc/hal.so were not linked against libXft.so.2 (they use none of the library's symbols)
dpkg-shlibdeps: warning: package could avoid a useless dependency if debian/emcapplication/usr/lib/python2.7/dist-packages/_togl.x86_64-linux-gnu.so debian/emcapplication/usr/lib/tcltk/linuxcnc/linuxcnc.so debian/emcapplication/usr/lib/tcltk/linuxcnc/hal.so were not linked against libXext.so.6 (they use none of the library's symbols)
dh_gencontrol
dpkg-gencontrol: warning: Depends field of package emcapplication-dev: substitution variable ${python:Depends} used, but is not defined
dpkg-gencontrol: warning: package emcapplication: substitution variable ${python:Versions} unused, but is defined
dpkg-gencontrol: warning: package emcapplication: substitution variable ${python:Versions} unused, but is defined
dh_md5sums
dh_builddeb
dpkg-deb: building package 'emcapplication-dev' in '../emcapplication-dev_2.9.0~pre0_amd64.deb'.
dpkg-deb: building package 'emcapplication' in '../emcapplication_2.9.0~pre0_amd64.deb'.
dpkg-deb: building package 'emcapplication-dbgsym' in '../emcapplication-dbgsym_2.9.0~pre0_amd64.deb'.
 dpkg-genbuildinfo --build=binary
 dpkg-genchanges --build=binary >../linuxcnc_2.9.0~pre0_amd64.changes
dpkg-genchanges: info: binary-only upload (no source code included)
 dpkg-source --after-build .
dpkg-buildpackage: info: binary-only upload (no source included)

I am also working with the presumption that the versioning (and name of the package) will be the same as in Machinekit-HAL. But should the LinuxCNC version (as of now 2.9) be included, or should it start from 0.1? Both emcapplication and emcapplication-dev will probably be needed (if only for tests).

I am also thinking it would be good to delete the documentation and pncconf (and friends) from the package, as these are potential sources of misunderstanding (and no documentation at all is better than wrong documentation).

Building for other architectures will be done in QEMU as cross-compiling would be too hard to hack in (and the point is to create minimal changes anyway) - GitHub Actions worker can run for six hours straight anyway.

BTW, I have seen that there is already some basic packaging work done (in debian/configure and so on) - did you have something specific in mind, @zultron?

The problem here is how to pull in build-time configuration. For the test suite, I tried to accomplish this by templating runtests in that PR you mentioned above. LinuxCNC doesn't have one, so this is actually harder to fix on the LCNC side. On the MK side, try reading $CFLAGS and $LIBS from pkg-config within the test.sh script.

Hmm, I can see it. Another problem is that machinekit-hal-dev installs the Machinekit-specific runtests program, which is of course different from LinuxCNC's one.

If you don't want to fix it yourself (I wouldn't), maybe you can rebase your development onto an upstream LCNC commit from just before that PR. That will give me some slack to fix the problem I introduced.

I have been looking at that pull request, and the main problem is that the only fix I can think of is to add another convoluted layer (or two), which is unsustainable in the long run.

I can try to half-arse it somehow and then you can repair it for good when you have the time.

I probably should have kept my mouth shut instead of mentioning to the LinuxCNC developers that the 888 pull request introduced an error and failure. Given how the buildflow is designed (and enforced), chances are that there wouldn't have been any green builds to this day and this would be a moot point (as I plan to only incorporate green patches). :wink:

cerna commented 4 years ago

Let's call it a Preview (or Beta), but I have put it into the Machinekit/EMCApplication repository; packages can be had from Machinekit/EMCApplication.

The tests are still not all running green, and I am sure I fucked something up during the rebasing/updating, but it was lying around long enough, so I decided to just let people use it and see.

The simplest way to try it:

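# On the host: start a throwaway Debian Buster container with X11 and DRI access
docker run --rm -it -e DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix -v /dev/dri:/dev/dri debian:buster
# Inside the container, as root: install prerequisites and create an unprivileged user
apt update
apt install sudo curl
echo "ALL ALL = (ALL) NOPASSWD: ALL" >> /etc/sudoers
adduser mk
su mk
# As user mk: register the three Machinekit package repositories
curl -1sLf \
  'https://dl.cloudsmith.io/public/machinekit/machinekit/cfg/setup/bash.deb.sh' \
  | sudo -E bash
curl -1sLf \
  'https://dl.cloudsmith.io/public/machinekit/machinekit-hal/cfg/setup/bash.deb.sh' \
  | sudo -E bash
curl -1sLf \
  'https://dl.cloudsmith.io/public/machinekit/emcapplication/cfg/setup/bash.deb.sh' \
  | sudo -E bash
sudo apt install emcapplication
# As root: append the ANNOUNCE_IPV4/ANNOUNCE_IPV6 settings to machinekit.ini
sudo -i
echo -e 'ANNOUNCE_IPV4=0\nANNOUNCE_IPV6=0' >> /etc/machinekit/machinekit.ini
exit
# Back as user mk: start LinuxCNC
linuxcnc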

Let me know about any problems.

ebo commented 3 years ago

I had cause to pull all of this up again and take a look at the progress. In short I got it to compile and install on Gentoo, but OH MY LADY GAGA it was a process... What follows are some notes to help others kick the bucket down the road as it is unlikely that I can take the time to move this fully forward.

First, my Gentoo overlay (which is all cruded up and not ready for prime time, but...) is at https://github.com/ebo/GentooCNC_RPi. I only compiled/installed this, and have not tested it at all. It also would not compile unless I had modbus configured and installed (so the base build system does not handle the lack of modbus properly).

The stampeding elephants coming through the door are yapps and pygtk. Yapps has not been updated since 2014, and is considered deprecated on every platform I know of, and pygtk strictly states that it has never and will never support python3... I looked around and it looks like the tools that use pygtk might be able to use PyGObject. I have no idea how much work that will entail, or if it is feasible. Regardless, it is likely a big task. As for yapps, I have included a modified deprecated ebuild so it can build out of the box as well.

Also to note, I based this off the code from https://github.com/machinekit/machinekit-hal and not https://github.com/zultron/machinekit-hal. If that was in error, I can switch back, but it looks like the main repository is the most up to date.

I will try to take a poke and test this later, but I have only drips and drabs of time at the moment.

BTW, are diffs sufficient at this time, or do you need proper pull requests? Not sure I will have enough to make it worth folks' while, but who knows.

cerna commented 3 years ago

The stampeding elephants coming through the door are yapps and pygtk. Yapps has not been updated since 2014, and is considered deprecated on every platform I know of, and pygtk strictly states that it has never and will never support python3... I looked around and it looks like the tools that use pygtk might be able to use PyGObject. I have no idea how much work that will entail, or if it is feasible. Regardless, it is likely a big task. As for yapps, I have included a modified deprecated ebuild so it can build out of the box as well.

The PyGTK to PyGObject issue hopefully will not be a big problem for much longer, as it is going to be solved exactly that way: by replacing the aging PyGTK with PyGObject and switching from GTK 2 to GTK 3 in the upstream of the EMCApplication code.

YAPPS is going to be a bigger problem. It is used in the Machinekit-HAL code to generate the instcomp and comp executables (which in turn generate the compilable code from .icomp and .comp files), so it is a pretty integral part. YAPPS is still available from PyPI and probably will be for the foreseeable future. So the build could be doctored to use a virtual environment stage and generate the Python code using standard Python packaging. That way, you get the best of both worlds: the system-wide, distro-specific packaging and Python-specific packages limited to a certain venv (ideally as an extending venv).

BTW, are diffs sufficient at this time, or do you need proper pull requests? Not sure I will have enough to make it worth folks' while, but who knows.

Thank you, diffs will suffice for now.

ebo commented 2 years ago

While trying to sort out issues in the quick start guide #363, I installed and built several versions of buster and bullseye on my machinekit-hal repository, and could not get any of them to work all the way through to getting any of the GUIs functional. Are any of them working with machinekit-hal? If so, can you post pointers to the documentation, or at least a few instructions here to get started? As far as I can tell, nothing has been deployable for the last 2+ YEARS. I just spent most of my free time over the last 3 weeks trying to get this up and running, and I am almost at the point of quitting altogether. I do want to see this work, but I am well into diminishing returns.