wspace / corpus

The definitive collection of interpreters, compilers, and programs for the Whitespace programming language.
MIT License

New Whitespace project with parser combinators #4

Closed Andreal2000 closed 8 months ago

Andreal2000 commented 8 months ago

Hi, I have implemented a Whitespace evaluator in Scala using parser combinators.

I tested it using some code found on vii5ard.github.io/whitespace/. I would like to add my interpreter to this collection, since I haven't found another Scala project that uses this technique.

Here's the link to my work.

thaliaarchi commented 8 months ago

Hey! Sure we can add your project! Just a couple of comments on it.

I see that it's just a single file. Do you have the project files to go with it? Things like build.sbt and whatever else is needed to make it reproducible? I've found that single-file projects are more difficult to figure out how to build, unless I'm already well familiar with the language's tooling.

Besides that, it looks like it could be easily run on the command line.

If you want to give it a shot, you're welcome to submit a PR. If not, I could add it today. Let me know if you have any questions.

thaliaarchi commented 8 months ago

You're now on the list! I've added your project, which you can now see at scala/andreal2000.

I tried to build it with SBT, but ran into issues. I've never had much luck using SBT.

Here's what I tried. I created a template project with sbt new scala/scala3.g8, placed your WhitespaceEvaluator.scala at src/main/scala, deleted Main.scala, and enabled the assembly plugin by putting addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "2.1.5") in project/plugins.sbt. Then when I ran sbt assembly, I got a bunch of variable and import resolution errors.

If you update your repo to be a full SBT project, then I'll add a Dockerfile, so it can be built like the others.

Andreal2000 commented 8 months ago

Sorry for the poor documentation.

I've never used sbt; usually I compile and run manually from the command line. I will create a full repo with sbt.

Are there any requirements other than sbt to make it run on Docker?

Currently my program accepts as args the paths to the files to run and then executes one file after the other. Is that OK, or is there some other specification for how to run the files?

thaliaarchi commented 8 months ago

Really, I just need a way to build and run it reproducibly. If that's with sbt, great. If it doesn't use a build system, also great. If you can get me a list of shell commands that will build and run it, including whatever dependencies, then I should be able to get it working. My only work with Scala has been with the projects included here, which all use sbt, so that's why I was talking about sbt.

For Scala, and any other language that uses the JVM, I have a two-stage process with building separate from execution. A Docker image first builds a jar with all its dependencies (like with sbt assembly), then those jars are all run within one image, which has only a JVM and no development dependencies. You can see that image at Dockerfile.jars. If your approach produces a jar, it also needs the dependencies included.

Your main looks fine. Everyone does their argument parsing differently, so there's no standard. I just record how it's done in project.json.

Andreal2000 commented 8 months ago

I studied this repository a bit and how you build the Docker images for the JVM projects. Now I will start moving my project from a Gist to a real repo with sbt; when finished, I will come back and reply to this issue.

thaliaarchi commented 8 months ago

By the way, I figured out what was wrong with my previous sbt setup: I didn't list the parser combinator dependency in build.sbt. I got so caught up with making some changes to docker compose that I forgot to push it. Sorry about that.

I have a repo up at github.com/wspace/andreal2000-scala, which I can transfer to your user, if you'd like. I used your gist as the base, since it's just a regular git repository that can be cloned. And if you want to do it all yourself, I'll just delete it once yours is working.
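For anyone reproducing that setup, here is a rough sketch of the project layout described in this thread, written as a shell script that only creates the files and does not run sbt. The sbt-assembly line is the one quoted above. The project name, the Scala version, and the `scala-parser-combinators` coordinates and version are assumptions for illustration, since the thread doesn't say which values the real build.sbt uses.

```shell
#!/bin/sh
# Sketch: scaffold a minimal sbt project layout in a temporary directory.
set -eu

proj=$(mktemp -d)/whitespace-evaluator   # hypothetical project name
mkdir -p "$proj/project" "$proj/src/main/scala"

# Plugin line as quoted in the thread.
cat > "$proj/project/plugins.sbt" <<'EOF'
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "2.1.5")
EOF

# Assumed values: adjust the name, Scala version, and dependency version
# to whatever the real project uses.
cat > "$proj/build.sbt" <<'EOF'
name := "whitespace-evaluator"
scalaVersion := "3.3.1"
libraryDependencies += "org.scala-lang.modules" %% "scala-parser-combinators" % "2.3.0"
EOF

echo "scaffolded $proj"
```

With WhitespaceEvaluator.scala copied into src/main/scala, running sbt assembly from the project root would be the next step. Whether it builds cleanly depends on the versions chosen.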

Andreal2000 commented 8 months ago

Nice, for now you can keep it like this. When I finish my repo we can change it. I will try to add some tests and documentation to make it better.

Andreal2000 commented 8 months ago

I finished my repo! Here's the link. I used your project as a base (thank you). I see that in your JSON you've written the names of the functions I used; I have changed one name because of a typo: heapRetrive -> heapRetrieve. If there are any problems, tell me and I will try to fix them.

thaliaarchi commented 8 months ago

Excellent! I've now updated the entry here for your project and it successfully builds with all the other JVM projects :).

I use the function names for a generated report in assembly.md, which is useful when designing a new Whitespace assembly (wsa) dialect or parsing existing ones, to see what names people have used for each instruction. When projects don't have a wsa dialect, I add the names used for the instruction functions or enum (which is why yours is marked as "usage": ["enum"]). Those tend to be less useful for parsing wsa, but I still track the data. Eventually, I'll get back to my universal wsa parser project.

thaliaarchi commented 8 months ago

Since the submodule clones a copy of your repo, it lets me automatically track updates to upstream repos. Periodically, I update all the submodules and review the changes, to see if there's anything I need to update here. So, I'll get changes to your repo when I do maintenance work here.

If you have anything else regarding your project, feel free to comment here.

Andreal2000 commented 8 months ago

Ah yes.

  1. I found some .wsa files, but I thought they were only used to document/explain a .ws file. How are they really used? And what is the "universal wsa parser project"?
  2. Are you the original creator of Whitespace?
  3. Will there be a 0.4 version of Whitespace?
  4. Why is this repository called corpus?
  5. Will the repo you created from my Gist file be deleted? (https://github.com/wspace/andreal2000-scala)

Sorry for the long list of questions :sweat_smile:

thaliaarchi commented 7 months ago

Sorry for my delayed response. I wasn't sure what to say about question 5.

1

Whitespace assembly is a syntax that lots of people use to read and write Whitespace code, to make it more sane. For some people, it's just a pseudocode, as you say, but to others it's a real language. Whitespace assembly is not anything standardized, and everyone who works with it tends to make their own dialect with subtly incompatible differences.

I personally use the syntax of Whitelips IDE, since it was the first one I ran into when I learned Whitespace and has more features than most Whitespace assembly dialects.

The universal wsa parser I mentioned is something I haven't finished and am not currently working on. Parts of it have been pushed to my lazy-wspace project, but most of the design is still just in my head, so I think you'd be disappointed with the code. It aims to be able to parse any wsa dialect into a common representation and convert between them.

2

No, I did not create Whitespace. It just kickstarted my interest in compilers, so I have a fondness for it and have put a lot of work into related projects.

Edwin Brady created it with help from Chris Morris and inspiration from Andrew Stribblehill. The original Whitespace homepage has been offline for 9 years, so I wouldn't be surprised if you had never seen it. Edwin writes about the start of Whitespace on the main and explanation pages, and in a 2021 interview where he also talked about his later language Idris. And, speaking of Idris, Whitespace is essentially a reduced version of the abstract machine from the work that led to Idris, but dressed up in invisible syntax. He created it during his PhD while he was researching those things.

3

There almost certainly won't be a version 0.4 of Whitespace. Edwin has moved on and there is no official successor. I don't think there is a need for a version 0.4 anyways.

In a related vein, you may be interested in Edwin's own Whitespace interpreter in Idris (here as idris/edwinb-ws-idr). To test Idris' maturity, he wrote a Whitespace interpreter in it, but it just implements version 0.3 and does not aim to replace his original interpreter in Haskell.

4

This project is called the Whitespace Corpus, because it is a collection of all the projects involving Whitespace that I've uncovered. I use “corpus” in a similar sense to a corpus in linguistics. A text corpus collects real-world text in a language, so it can be studied. This collects Whitespace projects, so they can be studied.

I don't rank projects; rather than curating and maintaining only the best, this collects all that I can find. I also see it as a historical preservation and archival project. It's a different goal than if it were called “Awesome Whitespace”, for example.

The data here helps in projects that care about interoperating with many community Whitespace implementations. Another related project I have been meaning to do is to gather all unique .ws and .wsa programs that people have made into one repo, with a synthesized git history that attributes each to its respective author. Most people copy their programs from other projects without attribution, so it requires quite a bit of original research and tooling to analyze it.

5

I try to make the data here be as accurate as possible. For example, the date field in project.json has the date a project was started. In this case, it's the first commit date from your gist, since that's the earliest date I have for your project, 2024-01-21 21:29:48 +0100.

When you started your repo, you copied the files from my fork and made a new commit. Unfortunately, that means it doesn't continue the git history and now the earliest date is 2024-01-25 22:38:59 +0100. My fork doesn't have this problem. I started by cloning your gist, so it includes that history.

So, what I've done now is I've grafted your new changes onto the start of the history. This means it has a continuous git history, including the first commit from your gist, my commit adding the sbt project structure, and your newer commits refactoring and polishing.

When hostilefork split rebol-whitespacers off of their whitespacers repo in 2021, I similarly grafted the old history onto the new, so it has all of its history back to 2010, instead of just to 2021. Hopefully this is welcome here too.

Take a look at my fork, and you'll see that the commits include all three repos' histories. Besides replacing where the initial commit came from, the histories are exactly the same (you can verify that by comparing the output of git log --raw --patch --reverse in both repos).

If you like that change, you can force-push it to your repo with these commands, while still keeping your working tree untouched:

cd WhitespaceEvaluator
git remote add wspace https://github.com/wspace/andreal2000-scala
git fetch wspace
git reset --soft wspace/main
git push --force -u origin main
git remote remove wspace

If you're curious, here are the commands I used to graft the histories. git replace replaces a commit (the initial commit, e98b5b5 in your repo) with a replacement commit (my sbt commit 97832fc, in my fork). This makes git tools treat it as if one commit is the other in git log and blame. Then, git filter-repo (which needs to be installed separately) makes the change permanent.

git clone https://github.com/Andreal2000/WhitespaceEvaluator
cd WhitespaceEvaluator
git remote add wspace https://github.com/wspace/andreal2000-scala
git fetch wspace
git replace e98b5b5744051ff49fca267c5506a6cc1edb9bb3 97832fc8875cdab38b139eac9875a1e8ae637654
git filter-repo --force
git remote remove wspace
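
To see what the git replace step does in isolation, here is a small self-contained sketch on two throwaway repositories. It is only an illustration, not the commands used on the real repos: the directory names `old` and `new`, the file names, and the commit messages are all made up. It stops before git filter-repo, so the graft exists only as a replace ref in the local clone.

```shell
#!/bin/sh
# Sketch: graft a restarted history onto an older one with `git replace`.
set -eu
work=$(mktemp -d)
cd "$work"

# Identity for the throwaway commits (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "old" stands in for the repo that kept the original history.
git init -q old
( cd old
  echo 'v1' > Eval.scala
  git add .
  git commit -q -m 'gist: initial commit'
  echo 'name := "demo"' > build.sbt
  git add .
  git commit -q -m 'add sbt project structure' )

# "new" stands in for the repo restarted from a copy of the files.
git init -q new
cp old/Eval.scala old/build.sbt new/
( cd new
  git add .
  git commit -q -m 'initial commit'
  echo 'v2' > Eval.scala
  git commit -q -am 'refactor' )

cd new
before=$(git rev-list --count HEAD)
git fetch -q ../old                       # tip of "old" lands in FETCH_HEAD
old_tip=$(git rev-parse FETCH_HEAD)
new_root=$(git rev-list --max-parents=0 HEAD)
git replace "$new_root" "$old_tip"        # treat the new root as the old tip
after=$(git rev-list --count HEAD)
echo "commits before graft: $before, after: $after"
```

After the replace, git log in `new` walks from the refactor commit through both commits of `old`, which is the continuous history described above.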

If you force-push to include the earlier git history, I'd be happy to delete my fork.

All that being said, I'm squabbling about a little detail. This case is nowhere near as significant as many of the 50+ projects that I've done git transformations and history preservation on. It's just three days. Although I wouldn't be attributed for the sbt commit, it's not a big deal. If you really want me to delete it anyways, I can. I would delete it and replace the link to it with a link to it on the Software Heritage archive.

Oh, and in the meantime, I changed my repo to be a proper GitHub fork of yours, now that you have a repo instead of just a gist. So it more clearly shows that yours is the definitive source.

Let me know if you have any questions.

Andreal2000 commented 7 months ago

Thanks for answering all my questions.

I have updated my repo to include your commit. I'm sorry for deleting your commit; I cloned your repo because I didn't know what happens when the original repo of a fork is deleted. I hope that it's all right now. 😃

thaliaarchi commented 7 months ago

I've deleted my fork!

By the way, here's what happens when an upstream repo is deleted but has forks. The GitHub docs say that one of the forks is chosen to be the new upstream repo, the new root that the other forks are forked from.

Another way to duplicate a repo is to use git clone locally, outside of GitHub. Then you can make an empty repo on GitHub (don't pick a template README or LICENSE), and then push your local repo to that GitHub repo. That's actually what I did to get a copy of your gist when I made my repo.
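
As a rough local illustration of that clone-then-push approach, the sketch below uses a bare repository as a stand-in for the empty GitHub repo. The directory names (`source`, `copy`, `empty.git`) and the sample file are hypothetical. With a real GitHub repo, the URL passed to git remote set-url would be the new repo's URL instead of a local path.

```shell
#!/bin/sh
# Sketch: duplicate a repo by cloning locally and pushing to an empty remote.
set -eu
work=$(mktemp -d)
cd "$work"

# Identity for the throwaway commit (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "source" stands in for the repo (or gist) being duplicated.
git init -q source
( cd source
  echo 'object Main' > Main.scala
  git add .
  git commit -q -m 'initial commit' )

# Clone it locally, with full history.
git clone -q source copy

# "empty.git" stands in for a new empty repo with no README or LICENSE.
git init -q --bare empty.git

# Point the clone at the empty repo and push the current branch.
cd copy
branch=$(git symbolic-ref --short HEAD)
git remote set-url origin "$work/empty.git"
git push -q -u origin "$branch"

pushed=$(git -C "$work/empty.git" rev-parse "$branch")
local_tip=$(git rev-parse HEAD)
echo "pushed $pushed, local $local_tip"
```

Because the clone carries the full history, the pushed copy keeps the original commit hashes and dates.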