A peculiar and fun Forth like compiler targeting bash functions, recently made available.

Bushmills commented 2 years ago

As I like Forth to be fun too, apart from its practical aspects, I've been enjoying Forths written in less likely languages, from scripting languages to editor macros to mobile device "automation" apps. Here's my recent concoction, named yoda, which is now at the point of starting to become somewhat useful, with the gravest flaws and problems eliminated. It's a Forth-like implementation not based on a virtual machine, but compiling to bash functions. These compiled functions execute in the same context as the compiler that generates them: the compiler doesn't compile to a file which is then loaded and executed, but into the environment in which it is itself running, much as you're familiar with from just about any Forth interpreter. It offers reasonable code transparency by providing words to view the source that compiled words were compiled from, as well as the generated code. It also features built-in word description lookup, aiming to lower the threshold for fledgling Forth users.
But in fact I shouldn't call this a "Forth", as it's merely "Forth-like": I didn't bother trying to adhere accurately to one or several of the available standards, partly because the host platform (bash) makes that somewhat hard. Examples: double cell integers, no "real" memory access, code space separated from (virtual) memory space, a return stack which doesn't live up to its name. However, it's up there for inspecting, toying with and hopefully enjoying.
About half of the words it provides have by now been described ("documented", sort of).

Requirements: a computer running bash (that means, probably some Unixoid system) with coreutils installed. Additionally, a text editor would be nice too. (The former sed dependency has been dropped.)
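
To make the "compiling to bash functions in the running environment" idea concrete, here is a minimal sketch. It is not yoda's code: the stack layout, the `prim_` and `word_` names and the `compile` helper are all assumptions made for illustration. It only shows how a colon definition can be turned into a bash function with `eval`, so that the new word exists in the same shell that compiled it.

```bash
#!/usr/bin/env bash
# Hypothetical mini-compiler; all names here are illustrative, not yoda's.
s=(); sp=0                                  # data stack and stack pointer
push() { sp=$((sp + 1)); s[sp]=$1; }
pop()  { popped=${s[sp]}; sp=$((sp - 1)); } # result is left in $popped

# Primitives are ordinary bash functions.
prim_dup() { push "${s[sp]}"; }
prim_add() { pop; local b=$popped; pop; push $((popped + b)); }
prim_dot() { pop; printf '%s ' "$popped"; }

declare -A dict=( [dup]=prim_dup ['+']=prim_add ['.']=prim_dot )
count=0                                     # numeric part of generated names

# compile NAME WORD...: turn a colon definition into a bash function and
# define it in the running shell with eval, with no intermediate file.
compile() {
    local name=$1 body=': ; ' word fn
    shift
    for word in "$@"; do
        if [[ -n ${dict[$word]:-} ]]; then
            body+="${dict[$word]}; "        # known word: emit a call
        elif [[ $word =~ ^-?[0-9]+$ ]]; then
            body+="push $word; "            # number: emit a literal push
        else
            echo "unknown word: $word" >&2
            return 1
        fi
    done
    count=$((count + 1)); fn="word_$count"
    eval "$fn() { $body}"
    dict[$name]=$fn
}

run() { "${dict[$1]}"; }                    # execute a word by name

compile double dup +                        # : double dup + ;
compile quadruple double double             # : quadruple double double ;
push 5; run quadruple; run .                # prints: 20
declare -f "${dict[quadruple]}"             # show the generated function
```

The last line uses bash's own `declare -f` to display the generated code, which is one cheap way a system like this can offer the kind of code transparency mentioned above.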

MitchBradley commented 2 years ago

This is amusing to me because just yesterday I was thinking how funny it would be to implement Forth in BASH. There is no way I would have done it though.

Bushmills commented 2 years ago

Here's another Forth in bash, but this one is virtual machine based, and therefore a tad slow: bashforth. This is older stuff; it's close to 20 years ago now that I enjoyed coding it. yoda is faster by a factor of about 30 to 50.

MitchBradley commented 2 years ago

About 30 years ago I wrote Forth in PostScript in one page of code.

Bushmills commented 2 years ago

At one point I attempted to code a Forth in Brainfck :) I got as far as working random memory access, then gave up on the attempt. That was about the time when I switched to a Perl implementation, going the lazy route :) That implementation is on my projects page on GitHub too, btw, but the completely incomplete Brainfck version I eradicated, no trace left of it.

MitchBradley commented 2 years ago

I was mentoring a high school student a few years ago and gave him a quick lesson in Forth. A couple of weeks later he came to me with an implementation in Haskell. He was having trouble with implementing data space but we figured out a way to do it by forking the workspace on every store operation. Hideously inefficient, but hey.

Bushmills commented 2 years ago

Mentioning Haskell, I was wondering what the most unlikely language ever used to implement a Forth might be, thinking of a handful of code golfing languages like Jelly, Vyxal, and also Husk, which took some inspiration from Haskell. My personal favourite may be a Forth running on a set of pneumatic or hydraulic valves as host CPU.

catb0t commented 2 years ago

This is great! Definitely one of the most unusual Forth implementations I've come across.

niclash commented 2 years ago

Here is another implementation challenge: Minecraft redstone.

Cheers Niclas

Bushmills commented 2 years ago

I got the wiki running and a handful of pages populated, aiding with the description effort of everything related to yoda

ruv commented 2 years ago

@Bushmills, I have looked at the differences, and I wonder why not use just another name when a word behaves differently?

For example, you could rename your variants:

- parse to parse$ (since it places a string on the string stack)
- immediate to compile-only (since your variant actually works like the latter in many Forth systems)
  - and define immediate to copy a header to the compiler vocabulary
- abort to (throw)
  - and define throw as : throw ( x|0 -- ) dup if (throw) then drop ;
  - and define abort as : abort -1 (throw) ;
- ?abort to ?throw (probably, for compliance, it should not throw code 0 by any means)
- string pattern "ccc" to "ccc"$ or $"ccc" (to indicate that the string is placed on the string stack)
  - and make the string pattern "ccc" place (c-addr u) on the data stack only
- <# # #s #> to <// // //s //> or <⌘ ⌘ ⌘s ⌘> (for pictured numeric output of single-cell numbers).

Bushmills commented 2 years ago

@ruv, thank you for your thoughts on standards compliance. This is valuable and very much appreciated input! Renaming parse to parse$ makes perfect sense, and I will follow that suggestion. The current naming is due to parsing preceding strings in yoda, and me not reconsidering the name choice of parse once strings were implemented.
The general answer to "why not use just another name" is that a system is either compliant or not compliant - an "almost" compliant system is still a non-compliant system, and given that I'm unlikely to manage "full" compliance, I didn't bother much about eliminating single aspects of non-compliance, as with the examples of the same name for different behaviour. With some, such as throw, implementation is still pending, which is why I wouldn't want to use up these words now already, even temporarily. I'd rather change abort once throw exists, letting it take advantage of it, and consider abort a stop-gap measure for the time being.
About changing the string pattern, I'm hesitant, mostly due to wanting to avoid c-addr u type strings altogether, or as much as possible: dealing with that type of strings - each char stored as its ASCII value in a single array entry - is rather inefficient in bash, so I want yoda to default to the string stack for string related operations, and only unpack and pack them when it's inevitable. Using "this type of string" just reads more naturally in this context, as I don't have plans to implement s" for c-addr u type strings. I may - possibly - think of adding a ." string pattern for output, but as no storing of strings is involved, there should be no need to identify by name what kind of string was output.
Your immediate vs compile-only suggestion I will consider. yoda used a precedence flag before, so it was easy to keep the immediate word where it was used, while only changing its implementation. I'm undecided with regard to this choice, and that's when I like to listen to opinions others may have.
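
The cost difference between the two string representations can be sketched in a few lines of bash. This is an illustration only: the array names `m` and `ss` and the helpers `unpack` and `pack` are assumptions, not yoda's actual words. It shows why a c-addr u string needs one conversion and one array store per character, while a string stack needs a single store per string.

```bash
m=()                       # "virtual memory": one array entry per character
ss=()                      # string stack: one array entry per whole string

# unpack STRING ADDR: store each character code at m[ADDR..]; length in $u
unpack() {
    local str=$1 addr=$2 i code
    for ((i = 0; i < ${#str}; i++)); do
        printf -v code '%d' "'${str:i:1}"   # character -> numeric code
        m[addr+i]=$code
    done
    u=${#str}
}

# pack ADDR LEN: rebuild the string from m[ADDR..]; result in $str
pack() {
    local addr=$1 len=$2 i ch
    str=''
    for ((i = 0; i < len; i++)); do
        printf -v ch "\\$(printf '%03o' "${m[addr+i]}")"   # code -> character
        str+=$ch
    done
}

unpack 'hello' 100         # five conversions, five array stores
pack 100 "$u"              # five more to get the string back
echo "$str"                # prints: hello

ss+=('hello')              # string stack: a single array store
```

Note that `pack` spawns a subshell per character in this naive form, which makes the per-character route even more expensive than the loop count alone suggests.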

Bushmills commented 2 years ago

Just another quick addition on the abort issue: yoda error handling hasn't been finalized. For proper error handling I want real warm start capability, which I don't have at this point. Currently, errors are partly signalled back through nested words, in the hope of eventually reaching from/evaluate or even quit. This is messy and not to my liking, and it also affects the preliminary status of error/abort/throw and related words. There probably can't be a definitive implementation before knowing the best way to deal with the occasional need to warm start the system.

MitchBradley commented 2 years ago

I too am a big fan of changing the name if the detailed semantics change. Otherwise anyone else trying to read the code can become utterly confused. And if you do decide to try and implement the standard version, you can do so without having to rewrite old code that uses the variant. My mantra: Names Are Cheap

Bushmills commented 2 years ago

Valid points you have there, @MitchBradley. You guys make so much sense here.

Bushmills commented 2 years ago

I've followed your valuable suggestions re word naming, and renamed immediate to compiled. For consistency, the former misnomer interactive is now interpreted. Thank you!

MitchBradley commented 2 years ago

I am glad that the suggestions made sense to you. It can be difficult to come up with good new names, but reusing an old name that has an established meaning usually causes problems going forward.

Bushmills commented 2 years ago

As my use of yoda is in part also as an experimentation vehicle, here's one of the observations gained from it, which may be worth considering in other Forths and Forth-alikes: delayed header creation. It does away with the commonly implemented hide/reveal construct, rendering it unnecessary and removing yet another header flag (if the implementation of hide/reveal uses one). Delayed header creation therefore has the potential to complement the replacement of the precedence header flag.

What I'm doing for delaying headers creation is to leave it to semicolon to create the headers of colon words. Should compilation fail, no header will be created at all. Should a header by the same name be referenced during compilation, as happens when compiling the former version of a word into a redefinition of it, the new header isn't found because it hasn't been created yet. The reason for delaying headers in yoda is actually a different one, so avoiding self-reference during compilation is more of a side effect, a byproduct, but one I consider valuable enough to think of using such an approach for this goal alone: getting rid of hide/reveal.
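
A rough sketch of delayed header creation, under the assumption that headers live in an associative array and compiled code is buffered until semicolon. The names `header`, `colon`, `ref` and `semicolon` are invented for this illustration and are not taken from yoda. The point is that a redefinition finds the old word during compilation, and a failed compilation leaves no header behind, without any hide/reveal flag.

```bash
declare -A header=()       # word name -> bash function name (the "headers")
count=0                    # numeric part of generated function names
pending_name=''            # name seen by colon, not yet a header
pending_body=''            # buffered code of the definition in progress

colon() { pending_name=$1; pending_body=': ; '; }

# ref NAME: compile a call to an already existing word
ref() {
    local fn=${header[$1]:-}
    if [[ -z $fn ]]; then
        echo "undefined: $1" >&2
        pending_name=''                    # mark compilation as failed
        return 1
    fi
    pending_body+="$fn; "
}

# semicolon: generate the function first, create the header last
semicolon() {
    [[ -n $pending_name ]] || return 1     # failed: no header is created
    count=$((count + 1))
    local fn="word_$count"
    eval "$fn() { $pending_body}"
    header[$pending_name]=$fn
    pending_name=''
}

prim_hi() { printf 'hi '; }
header[hi]=prim_hi

colon greet; ref hi; semicolon                 # : greet hi ;
colon greet; ref greet; ref greet; semicolon   # : greet greet greet ;  (old greet)
"${header[greet]}"                             # prints: hi hi
colon broken
ref nosuch || true                             # reports "undefined: nosuch"
semicolon || echo 'broken: no header created'
```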

pebhidecs commented 2 years ago

There was a fig-version called Split-Forth where the headers were kept very separate from the code. The idea was that the headers could be eliminated when the application was complete. Sounds like you are aiming at something similar.

Regards

Paul E. Bennett IEng MIET Systems Engineer Lunar Mission One Ambassador --

Bushmills commented 2 years ago

Hi Paul, no, not really. Separating heads is what I tend to do in just about any of my Forths anyway. With Forths written in interpreted and scripting languages it comes almost automatically, as one is inclined to use already existing data structures like arrays for headers, rather than unpacking names into (usually virtualised) memory. OTOH, assembly implementations, especially for smaller controllers, do benefit from being able to remove all headers, freeing up the space they occupy. I liked the vocabulary growing from the end of memory towards lower addresses with each new header added. In yoda, too, headers are already stored separately, in an array. But delayed creation of headers is an independent scheme, although it probably depends on headers being separated (or placed at a funny location, like behind the body of a word).

Bushmills commented 2 years ago

yoda is a spin-off of another experimentation platform, which was mostly used for testing ways to do constant expression folding, which is why words weren't compiled incrementally, but code - and pseudo-code - was buffered for post-processing, triggered by semicolon. yoda inherited this postprocessing, but not the CEF optimisation. With this already in place, delegation of header creation to this post-processing phase was then only a small step.

ruv commented 2 years ago

What I'm doing for delaying headers creation is to leave it to semicolon to create the headers of colon words.

I usually do it as well, using the following words:

relate-wordlist ( xt sd.name wid -- )
naming ( sd.name xt -- )

(where sd.name is a c-addr u pair)

I want to find better names for these words (especially for the latter one).

Bushmills commented 2 years ago

@ruv, what are your experiences with this approach of reversing the order of compilation and header creation? Any problem points you have encountered so far? My biggest issue with this is currently this: every time a header is created, the file handle and line number of the file it was loaded from is recorded, for easily accessing the word source for editing or viewing. At this point the recorded source location of colon words is the line carrying the semicolon, not the colon. Not horribly complicated to fix, and more a cosmetic issue than a real problem. Other than that I can't think of anything on the negative side. recurse I had to fix, but that was done needing only very little effort. last @ is likely to produce an unexpected result when executed during compilation, but I like to hide such internals from user code anyway. Header creation in my case is a pseudo-op, inserted into the instruction stream, with the word name as argument.

ruv commented 2 years ago

The general answer to "why not use just another name" is that a system is either compliant or not compliant

Actually, a standard system may have different degrees of compliance (see 5.1.1 System compliance).

And it's easy to make a system compliant: it's enough to provide the Core word set only (see 3 Usage requirements). Concerning other standard words, a word should be either provided and compliant, or not provided.

Character strings

due to wanting to avoid c-addr u type strings altogether, or as much as possible: dealing with that type of strings - each char stored as its ASCII value in a single array entry - is rather inefficient in bash,

It looks like Forth in Bash is not about efficient at all ;)

Standard character strings can be provided for compatibility only. A Forth system may use any other representation of strings for its internal use and/or in additional APIs.

Bushmills commented 2 years ago

"It looks like Forth in Bash is not about efficient at all ;)" - The more the reason to not substantially slow it down even more if it can be avoided - it may just make the difference between "usable" and "unusable"

ruv commented 2 years ago

what are your experiences with this approach of reversing the order of compilation and header creation? Any problem points you have encountered so far?

Usually the lifetime of a parsed string lasts until the next refill, and in such a case you need to save the string containing the name of a word somewhere until compilation of the word ends. As an option, the header can be created at once, but appending it to the compilation word list can be delayed.

There is no such problem if the lifetime of a parsed string lasts until the whole file is translated (as it was in one of my cases).

My biggest issue with this is currently this: every time a header is created, the file handle and line number of the file it was loaded from is recorded, for easily accessing the word source for editing or viewing. At this point the recorded source location of colon words is the line carrying the semicolon, not the colon.

Then probably the line number should be taken just before start compilation of the definition. I would associate this line number with xt to provide this information for anonymous definitions too.

recurse I had to fix, but that was done needing only very little effort.

recurse doesn't depend on the header, it only needs the xt of the current definition. I introduce a word germ that returns this xt, and recurse is as simple as:

: recurse ( -- ) germ compile, ; immediate

See a full example in my gist. It also shows one way how to deal with the name and create the header after end of compilation of the definition.

Bushmills commented 2 years ago

Then probably the line number should be taken just before start compilation of the definition.

Something similar I'm now doing - defining words inject another pseudo op into instruction stream, with file handle and line number as arguments. Source location information is then saved from these arguments, instead of produced when header is created.

recurse doesn't depend on the header, it only needs the xt of the current definition.

Colon words in yoda don't have bodies. They also don't have execution tokens in the common sense. In fact, there are neither name- nor code- nor parameter field addresses, and here initially produces an address below 10 (for a handful of variables). What those use as xt is a numeric portion of the function name associated with a word name, which is entirely unrelated to any memory address, so referencing a word internally is not based on an xt. xts are only "pretend-xts" to make words like ', execute and the like functional.
The central reference to a word is actually the word name, often used as hash key into an array. As a consequence, recurse did depend on the header.

Chances are that not everything that applies to other Forth and Forth-like systems is directly applicable to yoda. My recurse now looks like this:
code "${functionname_prefix}_$((nextname))", resulting in code like this: [screenshot: decompiled recursion]
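
Since the screenshot is not reproduced here, the following sketch shows the general shape of the idea. It is a guess at the technique, not a copy of yoda's output: the `word` prefix, the `xt_of` table, the number 1365 and the `tick` and `execute` helpers are all illustrative assumptions. A recursive word simply calls its own generated function name, and a "pretend-xt" is just the numeric part of that name.

```bash
prefix=word                # hypothetical function name prefix
declare -A xt_of=()        # word name -> numeric part of its function name
s=(); sp=0                 # data stack and stack pointer

# countdown ( n -- ): prints n .. 1; the recursive call is a plain call to
# the function's own generated name.
word_1365() {
    local n=${s[sp]}; sp=$((sp - 1))
    (( n > 0 )) || return 0
    printf '%s ' "$n"
    sp=$((sp + 1)); s[sp]=$((n - 1))
    word_1365
}
xt_of[countdown]=1365

tick()    { sp=$((sp + 1)); s[sp]=${xt_of[$1]}; }                     # ( -- xt )
execute() { local xt=${s[sp]}; sp=$((sp - 1)); "${prefix}_${xt}"; }   # ( xt -- )

sp=$((sp + 1)); s[sp]=3    # push 3
tick countdown             # push the pretend-xt 1365
execute                    # prints: 3 2 1
```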

ruv commented 2 years ago

In fact, there are neither name- nor code- nor parameter field addresses

It's OK. All these artifacts are implementation details that are under the hood. The standard is a quite high-level abstraction that hides all such details.

What those use as xt is a numeric portion of the function name associated with a word name, which is entirely unrelated to any memory address,

An execution token is not an address. It's an unspecified cell that only identifies execution semantics, and nothing more (see also Data types).

1365 is a perfect execution token in your example above.

As we can see, Tick (') returns xt with no doubt, and execute performs the corresponding execution semantics:

: bar 123 . ;
' bar execute \ prints 123

What is missing is the compile, word. It can be defined as follows:

primitive 'compile,' 'code "${header_code}_${s[sp--]}"' ;

Now recurse can be defined on the Forth level as:

: compileonly immediate ; \ compat
: germ ( -- xt ) last @ ;
: recurse germ compile, ; compileonly

BTW, even the core s" can be defined in your Forth as:

: lit, ( x -- ) ['] literal execute ;
: s" [char] " parse$ here unpack$ here over allot lit, lit, ; compileonly

Bushmills commented 2 years ago

Almost ... but, this wouldn't have worked as you wrote it in versions which postpone headers. It seems that you're basing this on a version of yoda from before header creation was postponed. last was updated by header, which was fine as long as headers were created before code was compiled. When order was reversed, last wasn't updated prior to compiled code, and pointed to another word than the most recently defined one when read during compilation.
This has been changed now (version 0.6.2) and has been put online only a few minutes ago - that quirk was mentioned in an earlier post in this thread but not deemed important enough to fix this quickly. As your code examples rely on proper contents of last, I did the fix and upload of the corrected (hopefully) version now.
It seems you took a good look at yoda, given your aptness in dealing with its peculiarities.

ruv commented 2 years ago

As your code examples rely on proper contents of last, I did the fix and upload of the corrected (hopefully) version now.

I relied on last just as an easy solution for an illustrative purpose only (yes, it worked in an earlier version).

Actually, germ should be properly implemented to return the xt of the current definition only (the current definition is the definition whose compilation has been started most recently but not yet ended).

When you provide literal, compile, (and/or postpone), you probably would want to throw an exception if the user tries to compile something via these words when the current definition is absent. last cannot help with this. Also, if you provide quotations, a definition can be the current definition several times. The concept of last cannot properly reflect this idea either.

A proper implementation (and handling) of germ solves all these problems.

Bushmills commented 2 years ago

compile, (and/or postpone), you probably would want to throw an exception

yoda supports forward references. for compile, and postpone, the call to a function can be compiled while the function doesn't exist yet. When defined later, the name of the compiled but declared "still missing" function will be used for naming the resolved word.
This is possible either through automatically created forward references (those are disabled by default; look at the +f and -f "convenience" switches to turn them on and off), or through manual support, by declaring a not yet existing word as needed through need word, which creates the word header and function name to be assigned when the word gets resolved at any later time.
It should only make sense to allow postpone, compile et al. to benefit from this convenience too.
Forward references in compiling are slightly different from immediately resolving words when referenced during execution (which is enabled by default - convenience switches +i and -i, also indicated on the status line flags display by capital vs lower case letters). The latter do throw an error if not resolvable right away, so calling those "forward references" would actually be a misnomer, even though they share code with compile time ("real") forward references.
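
Forward references come almost for free when the target is bash functions, because bash looks up a function name when the call executes, not when the calling function is defined. The sketch below relies only on that bash behaviour; the `need` and `resolve` helpers and the `header` and `missing` tables are hypothetical stand-ins, not yoda's implementation.

```bash
declare -A header=()       # word name -> bash function name
declare -A missing=()      # words declared as needed but not yet defined
count=0

# need NAME: reserve a header and a function name for a word defined later
need() {
    count=$((count + 1))
    header[$1]="word_$count"
    missing[$1]=1
}

# resolve NAME BODY: define the reserved function, clearing the missing mark
resolve() {
    local fn=${header[$1]}
    eval "$fn() { $2; }"
    unset "missing[$1]"
}

need helper
# "Compile" a caller while helper does not exist yet; only its name is used.
eval "word_main() { ${header[helper]}; printf 'main\n'; }"
echo "still missing: ${!missing[*]}"       # prints: still missing: helper
resolve helper "printf 'helper '"
word_main                                  # prints: helper main
```

Calling `word_main` before `resolve` would fail with bash's "command not found" error, which is roughly where a system would report an unresolved forward reference.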

ruv commented 2 years ago

yoda supports forward references. for compile, and postpone, the call to a function can be compiled while the function doesn't exist yet.

It has nothing to do with the problem I'm talking about.

Take a look:

123 lit,
: foo 456 . ; ' foo compile,

In both of these lines I append something to the current definition when it's effectively absent. If you want to make your system throw an exception in such a case, you need to maintain information about the current definition and its absence. And last cannot help you with that.

Concerning forward references: it seems this concept is applicable to ordinary words only. A system should then throw an exception if a word that was forward referenced is later defined as not ordinary (e.g. if it's compileonly).

Bushmills commented 2 years ago

I admit that I don't quite understand. First line you may be referring to lit, and assume that it doesn't exist (it actually doesn't). As lit, is interpreted, it won't be forward referenced. Either it can be resolved and loaded from library immediately (which it can't, because there's no lit, in the library), in which case it will be executed, or it can't, causing an error to be thrown. On second line you may refer to compile, and the same as for lit, applies. In neither case would forward references be involved, and neither line bears any relationship with last. What did I miss?

Here I've added stub/pretend lit, and compile, to library, source at bottom of screenshot. First, by ticking, I demonstrate that neither lit, nor compile, exist. Then I use them:

[screenshot: dummy lit, and compile, instant resolve]

(ed: I was tempted to let tick resolve from library, but decided against it because I use tick too often to check whether a word is present - in those cases I don't want the system to take action towards causing them to become present. That'd be like quantum resolving - the test affecting its own result)

ruv commented 2 years ago

First line you may be referring to lit, and assume that it doesn't exist

Sorry, I meant the words defined in a preceding message:

primitive 'compile,' 'code "${header_code}_${s[sp--]}"' ;

: lit, ( x -- ) ['] literal execute ;

compile, is a standard word, lit, can be defined in a standard way using postpone.

Well, even without any additional definitions, my point can be illustrated by system-specific code:

123 ' literal execute

Bushmills commented 2 years ago

I see now. Your assumption that those compile anything is possibly why I didn't understand: those don't compile anything. There is no "append" to a word, and no memory space "between" words. Once compilation of a word has been completed, you can think of it as an isolated blackbox: The word is stored outside of the scope of any memory accessing word. The only way to add anything to it is by redefining it.
While lit, through literal, will inject a pseudo-op for pushing a cell into instruction stream, that push will never be converted to actual code, nor appended to any already completed word. Instead it will, along with the value it attempts to append, suffer from being eradicated from existence when the next header creation pseudo-op is inserted. At that point could an error be thrown, as header creation pseudo op injection is capable of noticing that some dangling compilation attempts are floating around. But I preferred to silently ignore and discard those.
Both lit, and compile, only have an effect when the items they attempt to compile are inserted into the instruction stream at a time when it will actually be processed.

Bushmills commented 2 years ago

This, btw, is also the reason why implementing headerless words is no triviality in yoda, and why I for now settled for a dummy header which gets discarded instead. The same thing applies (for a simple "real" headerless solution): without some specific action, the headerless code will simply vanish.

ruv commented 2 years ago

Your assumption that those compile anything is possibly why I didn't understand

No. My assumption is that such compilation is incorrect and that a Forth system may throw an exception on it. The question then is how to make the system throw an exception when a program tries to compile anything while the current definition is absent. (And in the comment above I mentioned that germ helps to solve exactly this problem.)

Using the word germ this problem can be conceptually solved as follows:

: ?germ ( -- ) germ 0= abort" Error: the current definition is absent" ;
: lit, ( x -- ) ?germ lit, ;
: compile, ( xt -- )  ?germ compile, ;

The word germ (as I define it) returns the xt of the current definition, or 0 if such a definition is absent. "Current definition" is a formal term that is defined in the section 2.1 Definitions of terms of the standard.
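
For a system that buffers generated code in a shell variable, the same guard might look roughly like this. The sketch assumes a global `germ` variable that holds a nonzero identifier while a definition is open; `begin_def`, `end_def`, `need_germ`, `lit_comma` and `compile_comma` are hypothetical names, not part of yoda or of the standard.

```bash
germ=0                     # id of the definition under construction, 0 if none
body=''                    # buffered code of that definition

begin_def() { germ=$1; body=''; }          # called by defining words
end_def()   { germ=0; }                    # called by semicolon

# need_germ: fail at once if there is no current definition (like ?germ)
need_germ() {
    (( germ != 0 )) && return 0
    echo 'Error: the current definition is absent' >&2
    return 1
}

lit_comma()     { need_germ || return 1; body+="push $1; "; }
compile_comma() { need_germ || return 1; body+="word_$1; "; }

lit_comma 123 || echo 'rejected outside a definition'
begin_def 7
lit_comma 123; compile_comma 5
echo "$body"               # prints: push 123; word_5;
end_def
```

The error is raised by the offending call itself, so it is reported at the point of the mistake instead of at the next header creation.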

At that point could an error be thrown, as header creation pseudo op injection is capable of noticing that some dangling compilation attempts are floating around.

Yes, but it will be a delayed exception: not when an incorrect operation occurs, but when a correct operation occurs after an incorrect one.

But I preferred to silently ignore and discard those.

Yes, in this case it's better to ignore the error than report it delayed.

the reason why implementing headerless words is no triviality in yoda,

A Forth definition can be named or nameless (anonymous). Whether it has any header or not — is hidden under the hood; the standard doesn't specify any headers at all.

So I don't see what makes it non-trivial for yoda to support nameless definitions (i.e. to implement the :noname word).

Bushmills commented 2 years ago

It seems that standard doesn't require last, latest or a sometimes seen lastxt - what's the standard complying way to obtain the xt of the word currently under construction?

ruv commented 2 years ago

It seems that standard doesn't require last, latest or a sometimes seen lastxt

Yes. Such a word is not standardized yet. Some proposals were discussed in comp.lang.forth though. My point was that xt of the most recently created word, and xt of the current definition (i.e. that is under construction) should not be mixed.

what's the standard complying way to obtain the xt of the word currently under construction?

There is no such a way for a named definition.

For an anonymous definition, which is defined via :noname, the xt is on the stack due to the effect ( -- xt ) ( C: -- colon-sys ). So it's possible to obtain it (taking into account that the size of colon-sys on the data stack is unknown).

For example, :noname and semicolon can be redefined to support the germ method as follows:

variable _germ  : germ ( -- xt|0 ) _germ @ ;
: :noname depth >r :noname depth r> - 1- pick _germ ! ;
: ; postpone ; 0 _germ ! ; immediate \ reset with standard words ( 0! is not a standard word )

Concerning etymology, germ is connected with the words conceive and birth:

\ conceive ( C: -- colon-sys )
\ germ ( -- xt|0 )
\ birth ( C: colon-sys -- ) ( -- xt )

See my gist for description, rationale, and an implementation example.

Bushmills commented 2 years ago

After some musing I now arrived at this position:
Because the standard doesn't specify anything related to last and family, this opens another violation-free path. Not that I try to adhere as closely as possible to the standard, but I nevertheless want to refrain from blatantly violating it (when it can be avoided):

last isn't exposed any longer.
last$ is provided instead, simply pushing the name of the word under construction to the string stack.
Whether that's a good naming choice, in view of the possible ambiguity between "word under construction" and "most recently defined word", remains to be seen. I'll change the name if necessary.

Bushmills commented 2 years ago

Revisiting the matter of single vs double length numbers with the pictured number conversion words #, #s and #>, and the three routes offered, namely renaming them, squeezing them into standard compliance, or implementing um/mod and letting # use it:
Could those be called "compliant" if they formally follow the described stack effects, by operating on double length numbers, but omit to take the upper part of a double into account for conversion, instead simply ignoring it, so that the largest number produced by pictured number conversion will still correspond to a cell sized number?

MitchBradley commented 2 years ago

They are not compliant because they produce a different result. Is "+" compliant if 2 2 + gives 3? The whole point of a standard is so that a standard program gives the same results on different systems.

Either implement the standard or do not.

ruv commented 2 years ago

Could those be called "compliant" if they formally follow the described stack effects, by operating on double length numbers, but omit to take the upper part of a double into account for conversion, instead simply ignoring it,

It's very unexpected, error-prone and frustrating for a user.

If you don't want to handle the most significant part, then, if this part is nonzero, throw an exception with a clear error message, but don't silently ignore it. Also, document it as an environmental restriction.

Also take into account that um/mod can be implemented at a high level (as well as all other double number arithmetic); see for example bigmath.f from SwiftForth (the actual code is in the public domain).

If performance is a bottleneck, I would use an alternative set of words for single numbers pictured-numeric output (e.g. <⌘ ⌘ ⌘s ⌘>, as I mentioned before) for inner purposes, and probably provide the double numbers arithmetic and standard pictured-numeric output words in an external library.
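
As an illustration of how little is needed for double-number conversion, here is a sketch of the long division behind # and #s, written in bash. It assumes, purely for the example, that a cell is 32 bits wide so that intermediate values fit into bash's 64-bit integers; the function name `ud_to_string` is invented, and yoda's real cell size and words may differ.

```bash
# ud_to_string LO HI BASE: convert an unsigned double-cell number (two
# 32-bit cells, an assumption of this sketch) to digits; result in $result
ud_to_string() {
    local lo=$1 hi=$2 base=$3 rem cur d
    local digits=0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ
    result=''
    while :; do
        rem=$(( hi % base )); hi=$(( hi / base ))   # divide the high cell
        cur=$(( (rem << 32) | lo ))                 # carry the remainder down
        lo=$(( cur / base )); d=$(( cur % base ))   # divide the low cell
        result=${digits:d:1}$result                 # like #: prepend one digit
        (( hi == 0 && lo == 0 )) && break           # like #s: stop at zero
    done
}

ud_to_string 0 1 10;   echo "$result"      # prints: 4294967296
ud_to_string 255 0 16; echo "$result"      # prints: FF
```

With bases up to 36, the intermediate value `cur` stays below 36 times 2^32, so it never overflows a 64-bit integer under the 32-bit cell assumption.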

MitchBradley commented 2 years ago

My various Forth systems have used u#, u#s, and u#> for single numbers since the dawn of ANS Forth. The standard double-number variants are, of course, fully supported.

ruv commented 2 years ago

My various Forth systems have used u#, u#s, and u#> for single numbers

Excellent naming. I like it! And the word <# is the same for both cases.

Bushmills commented 2 years ago

Fully functional pictured number conversion words working on double length numbers, combined with single length number output words that still use single length conversion words whose headers (and in part their code, when inlined) are removed afterwards, sounds like the best deal, which is why I have now gone down this path. Given that those headers are trashed after use anyway, the naming wasn't really that relevant: I went for <x x xs x> until the set of single length output words had been compiled, after initially considering <_ _s _>. Thank you for being so helpful in aiding me to make up my mind.

Bushmills commented 2 years ago

What do you suggest I do with unused? It is required by core, but inherently hard to implement in such a way that the result it produces bears any relationship to real-world data.

ruv commented 2 years ago

The word unused belongs to the Core extension word set. This word set is optional, so a standard system is allowed to not provide the word unused at all.

OTOH, you can define unused as follows:

10 1024 * 1024 *  constant dataspace-size \ 10 MiB (formally, address units), just for example
: unused ( -- u )
   dataspace-size here - 0 max
;

Just choose some appropriate value for dataspace-size for which the system is still operable (when almost all this space is used).

Bushmills commented 2 years ago

Indeed, that's an effective way to solve it. My mind was somehow set to providing "real" data, ignoring that an arbitrarily designated value is just as real.

ForthHub / discussion

A peculiar and fun Forth like compiler targeting bash functions, recently made available. #110
