Open mamccollum opened 1 year ago
With the exception of importing directly from Illumos, I think it could be a good idea to implement/port missing commands. I mean, my idea is to try to make this package not-so-CDDL'd, since it could cause annoyance later on to folks trying to use this in an embedded environment.
I'm on my way to porting `write(1)` from UNIX v7, so it will pair up with `mesg(1)`, but I haven't done a lot of work on it yet.
Don't misunderstand me, please, but I think it would be more helpful (from a user point of view) to have default utilities with the new POSIX 2008 standard implemented. We started doing so when we fixed that bug in the default `rm`.
But, anyway, what tools do you propose to port? Maybe there's something that I'm missing out and that could make a difference.
By the way, about porting from OpenBSD, I had an idea of creating a "ucblib" package that could be used as a libbsd alternative. Not inside Heirloom NG, though.
> (...) my idea is to try to make this package not-so-CDDL'd, since it could cause annoyance later on to folks trying to use this in an embedded environment. Don't misunderstand me, please, but I think it would be more helpful (...) to have default utilities with the new POSIX 2008 standard implemented.
Alright, I understand! I'm not offended at all, and I understand the licensing issues and the need to keep up with the standards. Are there any explicit tasks that should be done on that side? I know there are the docs for 2008 and 2017, and I could compare the behavior expected by the standard against the results of Heirloom.
> But, anyway, what tools do you propose to port?
I was double-checking against the tools in GNU coreutils (not suggesting at all that we port from there, I was just looking at compatibility with it), and I could have sworn there were more that I wanted to port, but all I could find that was actually missing and wasn't 100% a GNUism was `seq` and `rmdir`, which are relatively easy to develop.
Also on the topic of standards like POSIX -- is there any test suite I could/should use to see how compatible Heirloom NG's utils are in comparison to other stuff? I know there's the Open POSIX test suite, but it looks like it hasn't been updated in at least a decade, if not five more years on top of that. I've also heard of the GNU coreutils test suite that the folks over there use to test their software.
> Are there any explicit tasks that should be done on that side?
Nothing that I can point to at first glance, besides the `rm` fix that I mentioned before.
> I know there are the docs for 2008 and 2017, and I could compare the behavior expected by the standard against the results of Heirloom.
These will do the trick. 😄
> (...) that was actually missing and wasn't 100% a GNUism was `seq` and `rmdir`, which are relatively easy to develop.
Well, I could argue that one may use `{1..X}` or a C-style for loop instead of `seq 1 X` for counting, but that only applies to Korn Shell 93, GNU Bourne-Again Shell and Z-Shell, and there are many people who are still writing shell scripts for POSIX-only environments and depend on `seq`.
`seq` itself, except for some GNU extensions, is just a matter of taking `argv[1]` and `argv[2]` and counting between them.
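To make "counting between them" concrete, here is a minimal integer-only sketch. The function name `seq_print` and its signature are hypothetical (this is not the code from the `seq-impl` branch), and it deliberately ignores floating-point operands, separators and overflow near `LONG_MAX`.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write every integer from first to last (inclusive)
 * to out, one per line, advancing by step.
 * Returns the number of lines written, or -1 if step is zero.
 * A range that step can never reach (e.g. 5..1 with step 1) writes nothing.
 */
long seq_print(FILE *out, long first, long step, long last)
{
    long i, n = 0;

    assert(out != NULL);
    if (step == 0)
        return -1;
    for (i = first; step > 0 ? i <= last : i >= last; i += step) {
        fprintf(out, "%ld\n", i);
        n++;
    }
    return n;
}
```

A `main()` built on this would parse its operands with `strtol()` and call something like `seq_print(stdout, first, 1, last)`.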
`rmdir` should be just a thin wrapper around a system call (like `readlink`, which I've implemented), so it also shouldn't be hard to implement.
Definitely on the list. 👍🏽
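As an illustration of how thin such a wrapper can be, here is a sketch built directly on `rmdir(2)`. The name `remove_dirs` and the error message format are assumptions for this example; it is not Gunnar Ritter's implementation and it leaves out options such as `-p`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: remove each (empty) directory in paths[0..n-1]
 * with rmdir(2). It keeps going after a failure, reports the failure on
 * stderr, and returns EXIT_FAILURE if any removal failed.
 */
int remove_dirs(int n, char *const paths[])
{
    int i, status = EXIT_SUCCESS;

    assert(n >= 0 && (n == 0 || paths != NULL));
    for (i = 0; i < n; i++) {
        if (rmdir(paths[i]) != 0) {
            fprintf(stderr, "rmdir: %s: %s\n", paths[i], strerror(errno));
            status = EXIT_FAILURE;
        }
    }
    return status;
}
```

A `main()` would simply pass `argc - 1` and `argv + 1` after option parsing.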
Another two commands that I was thinking about implementing were `watch` and `timeout`, but my code hasn't worked so well and I haven't tried to go any further since then.
> Also on the topic of standards like POSIX -- is there any test suite I could/should use to see how compatible Heirloom NG's utils are in comparison to other stuff?
I was thinking about that recently and... no, there's nothing to test all the utilities at once.
However, `sed` has its own set of tests and `grep` has some tests described in the NOTES file.
Maybe a sane option would be to port the tests from toybox, since it's public domain and won't cause any licence problems with the current source tree.
I was also thinking about testing in GitHub's VMs whether it builds on other flavors of BSD and macOS; it would be useful to have this information both here as a table and as a badge on the website.
> I was also thinking about testing in GitHub's VMs whether it builds on other flavors of BSD and macOS (...)
vmactions has workflows for building on the BSDs and Solaris even though they're not officially supported (I believe it uses the macOS VM with a VM inside of that to do the trick).
P.S. I once tried building on macOS and there were... several big issues. From a lot of missing headers (can be fixed with macbrew, but it's quite annoying) to similar issues with deprecated functions that OpenBSD had, and even things as basic as APFS not being case-sensitive (unless you specifically configured your Mac's FS otherwise in a rescue CD). I didn't want to create an issue at the time because it's a massive series of issues and I think it's better that we focus on primarily Linux & secondarily the BSDs for now. (Though I'm not the leader, so just take this as friendly advice)
> vmactions has workflows for building on the BSDs and Solaris even though they're not officially supported (I believe it uses the macOS VM with a VM inside of that to do the trick).
Yeah, I've been thinking about it! Although it is needed, maybe it's not our priority for now compared to implementing those new tools.
> P.S. I once tried building on macOS and there were... several big issues.

I can imagine...
> I didn't want to create an issue at the time because it's a massive series of issues and I think it's better that we focus on primarily Linux & secondarily the BSDs for now.
Sure, I agree with you.
Also, I made a commit in my fork of the repository that changes the readlink makefile so that it cleans up the UCB binaries, as `make mrproper` was leaving them behind. Should I just leave that commit there and make a PR later when I make more changes to my fork?
> Also, I made a commit in my fork of the repository that changes the readlink makefile so that it cleans up the UCB binaries, as `make mrproper` was leaving them behind.

Damn it, how could I have missed that?
> Should I just leave that commit there and make a PR later when I make more changes to my fork?

That seems O.k. to me.
Understood. I'll look into the test suites from toybox and get back to you. 😄
Good! And I'll be implementing the `seq` command.
Merci![^1]
[^1]: I'm not sure if it's still being used in English, but I've learnt that "Merci" could be used as a thank-you in English.
One more thing -- `chgrp` is also missing. Forgot to mention that, sorry.
> One more thing -- `chgrp` is also missing. Forgot to mention that, sorry.
Actually, it's not.
Many commands in Heirloom are supplied by symbolic/hard links that change `argv[0]`.
For instance, `chgrp` is a link to `chown`, `dfspace` is a link to `df`, etc.; take a look at the manual pages on the website. The commands that don't have a description and/or a directory of their own are supplied via symbolic/hard links.
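The link trick described above can be sketched as follows. The names `prog_name`, `pick_personality` and the `personality` enum are made up for illustration and are not taken from Heirloom's `chown` source; the point is only that one binary inspects the last component of `argv[0]` to decide which command it is.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical set of behaviours that one binary can take on. */
enum personality { P_CHOWN, P_CHGRP };

/* Return the last path component of argv0 without modifying it. */
const char *prog_name(const char *argv0)
{
    const char *slash;

    assert(argv0 != NULL);
    slash = strrchr(argv0, '/');
    return slash != NULL ? slash + 1 : argv0;
}

/* Behave as chgrp when invoked through a link named "chgrp". */
enum personality pick_personality(const char *argv0)
{
    return strcmp(prog_name(argv0), "chgrp") == 0 ? P_CHGRP : P_CHOWN;
}
```

With this, `ln chown chgrp` (or a symbolic link) is enough to provide the second command without a second source directory.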
Oh wow, I didn't notice that. Thanks for informing me!
O.k., the initial implementation wasn't standard, so I implemented `seq` according to the standard and got some new funky bugs.
I don't really know where to go now.
https://github.com/Projeto-Pindorama/heirloom-ng/tree/seq-impl
EDIT: Kind of fixed at #29
Hi, Molly (@mamccollum), good evening. How are you?
Just an update: `seq` now works just fine (at least here; I'm not sure yet whether it breaks or misbehaves on other hosts), and `rmdir` was already implemented by Gunnar Ritter back in July 2002, and it also has its own directory here in the source tree.
`seq` was harder than I thought, not because of the algorithm, but because of the implementation standard combined with my "just do it"/compact style of programming: while many got it in more than 90 or 100 lines, I got it in 65 lines of code, counting spaces. There were some hours of debugging on the last day I worked on it, along with some segmentation faults.
> Another two commands that I was thinking about implementing were `watch` and `timeout`, but my code hasn't worked so well and I haven't tried to go any further since then.
`watch(1)` implementation first made at #30, improved at #31.
On my way to `timeout(1)`.
Hey, that's good to hear. I think a while back (probably around 3 weeks ago now) I was working on a feature here, but I sadly forgot what it was. My fork ended up just turning into a messy disaster and I had to re-start. Is there anything I should assist in working on?
Well, just testing and completely porting to OpenBSD (although I think having contributions from more OpenBSD folks would help too).
`watch(1)` works surprisingly well for its "100-liner" size (I'd risk saying that it's even better, proportionally, than procps-ng's `watch(1)`), but the title/information header being narrower than the terminal's maximum line width kind of annoys me a little.
About `timeout(1)`, I couldn't rewrite it yet, mostly because it's complex. I'm afraid that, in the end, I'll end up sourcing it from OpenBSD's source tree and making some modifications so that it fits Heirloom NG, like using SVR4-like error reporting via `pfmt()`/`prerror()` instead of `err()`-like functions, etc.
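For a sense of where the complexity comes from, here is a rough sketch of the core a `timeout(1)` needs, assuming only POSIX `fork()`, `execvp()`, `sigaction()`, `alarm()` and `waitpid()`. The name `run_with_timeout` is hypothetical, the exit status 124 on expiry is borrowed from GNU `timeout` as an assumption, and nothing here is Heirloom NG's or OpenBSD's code (it does no `pfmt()`-style reporting, no `-k`, no signal selection).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t timed_out;
static pid_t child = -1;

/* SIGALRM handler: note the expiry and ask the child to terminate. */
static void on_alarm(int sig)
{
    (void)sig;
    timed_out = 1;
    if (child > 0)
        kill(child, SIGTERM);
}

/*
 * Hypothetical helper: run argv[0] with arguments argv, sending SIGTERM
 * after secs seconds. Returns 124 on expiry (assumed convention), the
 * command's exit status otherwise, 128+signal if it was killed by a
 * signal, or -1 on an internal error.
 */
int run_with_timeout(unsigned secs, char *const argv[])
{
    struct sigaction sa;
    int status;

    assert(argv != NULL && argv[0] != NULL);
    timed_out = 0;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        execvp(argv[0], argv);
        _exit(127);            /* exec failed */
    }
    alarm(secs);
    while (waitpid(child, &status, 0) < 0) {
        if (errno != EINTR)    /* EINTR just means SIGALRM arrived */
            return -1;
    }
    alarm(0);
    if (timed_out)
        return 124;
    return WIFEXITED(status) ? WEXITSTATUS(status) : 128 + WTERMSIG(status);
}
```

Even this small version has to juggle a signal handler, an interrupted `waitpid()` and three kinds of exit status, which is roughly why a complete implementation grows quickly.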
Just realized that `du` doesn't print anything if called with just one file.
> Just realized that `du` doesn't print anything if called with just one file.
I was too busy to correct myself, but never mind, it's part of the standard.
If I'm not mistaken, only `/usr/5bin/posix/du` prints individual files without `-h` or `-s` by default.
For those who want to write portable scripts: always use `du -s` or even `du -hs`.
@mamccollum I was messing around with Heirloom `tar` (since it's on my `PATH`) and I think I've found a new glitch.
I usually copy folders using `tar -cvf` and `tar -xvf` in a pipeline, like Plan 9 does, and I've found out that it doesn't extract for some reason, printing "tar: 1 file(s) not extracted".
I'll take a deeper look at it later but, for now, I'll be using Schily's tar as always.
chimera-utils has laid a lot of the groundwork for using FreeBSD coreutils on Linux. We could take some of the patches from there. Also, what about the code from sbase and ubase by suckless.org? It's all MIT licensed.
First of all, sorry for the delay in responding.
I think that taking a small part of chimera-utils may be useful for some utilities, such as `write` (although it has to stay compliant with its UNIX v7 version to be on par with `mesg`), or as something to help finish the timeout implementation.
About sbase/ubase, I have already taken a look at some of the code; I think that it would be equivalent to copying code examples. It may be useful as a reference, but I think that we can implement more complete utilities. But that's something to consider too.
sbase is POSIX and minimalist. I thought that heirloom strived for that too. Is compatibility with GNU coreutils a wanted feature for the project?
You're a little bit off from what I meant.
Heirloom is meant to be POSIX, if one wants to put it this way, but on the middle path betwixt simplicity and convenience, like you would find in some of the UNIX-compatible systems cited in Heirloom's intro.
Suckless' sbase is more like an example (similar to what you could find at OpenGroup.org when looking at the POSIX specification) of what a command should be instead of what it could be.
I don't want Heirloom to be like GNU Coreutilities in this fork, but I also don't want to call just a handful of lines a "complete utility" when it lacks features that are useful for the end user and, at the same time, not redundant. A good example of what I mean would be a comparison betwixt procps-ng's `watch`, Heirloom NG's and Suckless' ubase one.
procps-ng's is overly complicated, Heirloom NG's does its job flawlessly in less than 1/4 of the L.O.C. that procps-ng has, and Suckless' ubase one works about as well as a shell script hack done in two and a half minutes.
In all honesty, I must say that Suckless' {u,s}base may work as well as the POSIX specification for lighting the way to go, but I wouldn't just copy the code into Heirloom's source tree merely because it's under a compatible licence; at most, I'd fork it and enhance it.
I'm quite busy lately, so I can't burn daylight working on #36 for now, even with references to work from instead of writing completely from scratch as you suggested, nor can I discuss this matter further.
Okay, I think I get it. Then what about taking inspiration/work/code from Toybox? It's 0BSD licensed, by the same guy that started Busybox, it's currently used in Android phones, it also works on MMU-less devices, and it's pretty lightweight yet convenient.
> Then what about taking inspiration/work/code from Toybox? It's 0BSD licensed, by the same guy that started Busybox, it's currently used in Android phones, it also works on MMU-less devices, and it's pretty lightweight yet convenient.
Yeah, that is somewhat the goal: a sane and yet convenient environment. Toybox uses too many internal functions, so porting code directly from it is more difficult than doing a clean-room implementation. NetBSD/OpenBSD's userland is also an inspiration, but we avoid taking code directly from it and try to work out how to implement utilities ourselves.
I have taken your idea of basing some utilities on OpenBSD code when fixing/"filling" Heirloom NG's timeout implementation.
Released 240220 today; I would like some feedback from you.
@arthurbacci pointed out that my method for converting the float back to a string is redundant and could lose precision; I have already noted this as something to fix in the next release. `seq` is still ridiculously incomplete: I think it could at least mimic the Research UNIX v8 implementation and also have its "separator" condition fixed for some cases, but I think I can get this done.
> Released 240220 today; I would like some feedback from you.
Many fixes are now being addressed at #41.
I would like to complement this issue with the fact that `tar` is somehow broken.
For instance, while I was testing the Copacabana Linux build system, I noticed that Heirloom's tar isn't extracting tarballs passed to it through a pipeline, responding with this error:
tar: 2 file(s) not extracted
Maybe this can be fixed after some debugging; I'm just taking notes here in case someone gets to it before me.
Please open an issue for tar
That's going to be fun.
Opened a specific issue for tar at #44. cc.: @arthurbacci
@arthurbacci I was thinking, could we borrow the "libutf-8" from Plan 9 for libcommon? Or some other implementation of UTF-8 for C.
I found this page which lists some implementations (including Plan 9's) and drawbacks: https://www.linuxdoc.org/HOWTO/Unicode-HOWTO-6.html
There's also utf8proc by the Julia Programming Language development team, it looks good, and it's also small. https://github.com/JuliaLang/utf8proc
This would be a grand improvement on Heirloom, since we could make other programs UTF-8 compliant too.
I'm saying this mostly because of #52, but we could apply this to `ls`, `more`/`pg`, everything (almost).
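For context on what any of these libraries has to do at minimum, here is a small sketch of decoding a single UTF-8 sequence by hand. The function name `utf8_decode` and its return convention are hypothetical; it is not the API of Plan 9's libutf, utf8proc or libgrapheme, and it does no grapheme or width handling, which is what tools like `ls` and `more` would additionally need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode one UTF-8 sequence from s (n bytes available).
 * On success stores the code point in *cp and returns the number of bytes
 * consumed (1 to 4). Returns 0 for truncated, overlong, surrogate or
 * otherwise invalid input.
 */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    /* Smallest code point that legitimately needs each sequence length. */
    static const uint32_t min_cp[] = { 0, 0, 0x80, 0x800, 0x10000 };
    size_t len, i;
    uint32_t c;

    assert(s != NULL && cp != NULL);
    if (n == 0)
        return 0;
    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {
        len = 2; c = s[0] & 0x1F;
    } else if ((s[0] & 0xF0) == 0xE0) {
        len = 3; c = s[0] & 0x0F;
    } else if ((s[0] & 0xF8) == 0xF0) {
        len = 4; c = s[0] & 0x07;
    } else {
        return 0;              /* stray continuation or invalid lead byte */
    }
    if (n < len)
        return 0;
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;
        c = (c << 6) | (s[i] & 0x3F);
    }
    if (c < min_cp[len] || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;
    *cp = c;
    return len;
}
```

A caller walks a buffer by repeatedly decoding and advancing by the returned length, treating a return of 0 as a malformed byte to skip or report.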
Any way to drop it into this project? I would've surrendered to wchar if I could implement it without getting cryptic memory faults when running fgetwc().
The libutf from sbase is also really good and tidy: https://git.suckless.org/sbase/files.html. But I feel like libgrapheme would be better in the long term; it's very maintainable.
That's a good suggestion too, but maybe we will just stick to libgrapheme. I hope it doesn't change much, so we could just embed it in the code and add some directions in the build system to link it.
Hey there, I had an idea I wanted to share. I understand that future contributions are supposed to be under the Zlib License, and I know that Illumos and the BSDs are NOT under the Zlib License. However, I was wondering whether, for new commands or for commands already licensed under the CDDL or other existing compatible licenses, we could potentially import commands, features, and more from Illumos and/or the BSDs, such as OpenBSD.
Does anyone else believe this has potential? I could work on ensuring it uses libcommon, etc. I understand that we already have some work to do with OpenBSD compatibility, but I believe this could help push some innovation and expansion of the project, albeit at the cost of maintaining more.
Thoughts? (PS: if we do go through with this, should we try to import the git history for the specific files from Illumos, etc.?)