Open ehildenb opened 3 years ago
I'd really appreciate a decent manual for k, this is really hard to get into without one.
Here is the proposed lesson outline for the proposed tutorial:
I will get started on the basic lessons shortly unless anyone has any last minute suggestions for alterations.
I would move "Debugging with gdb", "Backends and search", "Unification and symbolic execution", "Basic proofs / ensures" to intermediate concepts. This groups it with other specialized tooling that exists in intermediate, and it makes it so that basic concepts is focused on K, and not on tooling.
I would move "List, Map, and Set", "Function context", and "Record productions and named nonterminals" to the basic concepts.
I would also add a section (in intermediate) on configuration variables.
Do we want to include a section on using explicit klabels to work around K parser issues (e.g. what if my language syntax uses a reserved keyword in K?)?
Does the parser generation section cover semantics which consume multiple kinds of inputs (i.e. more than one input file or an input file and a string, etc.)?
@sskeirik see point (5) in intermediate for explicit klabels.
@sskeirik I believe the parser generation for other configuration variables is going to be covered in the section on cells and the configuration, because it can't really be introduced until we introduce configuration variables themselves. The parser generation section is going to focus on the basics of using Bison parsing, priority, associativity, prefer/avoid, and lexical syntax.
I tend to disagree with Everett's suggested reorderings, at least some of them. I definitely think proofs should be part of the basic concepts, and that necessitates building up to them with most of the other sections I mentioned. Plus, people are going to want to understand how to debug sooner rather than later because it's a very important basic concept when developing code, especially if you're not familiar with a language. Also, I guess I don't really see that function context and record productions are really basic features of the language compared to the other items in Basic Concepts right now. I'm not opposed to moving List/Map/Set earlier, though
@dwightguth I don't feel strongly about function context and record productions, so those can stay as you have put them. Let's please move List/Map/Set earlier though.
I do think we should not introduce more advanced tooling in basic concepts though. I rarely use the LLVM debugger, and I'm an advanced K user. I only use it to debug very tricky cases in the semantics or bugs in the LLVM backend itself, which should not come up in the basic concepts section. If you want to introduce debugging techniques in basic concepts, I would suggest instead going for showing them how to use --depth
.
Similarly for proving, I think that basic concepts should be focused on teaching K (the language), not on teaching what you can do with K. Proving is a use of K, not inherent to the K language itself.
In fact, I would say that the entire basic concepts section should be do-able with just the llvm backend (no symbolic execution). It should be focused on teaching K (the language), not K (the ecosystem of tools).
I've had a number of people come to me over the past year or so dissatisfied with the status quo of debugging in k, which essentially amounted to repeatedly executing code up to a certain depth until you find where the problem is. As a result, I built the gdb set of tools for debugging k definitions, which allows you to break when certain side conditions are evaluated, certain rules are applied, or certain functions are called, and immediately see the values of the arguments to the function, or the current configuration. It is also possible to step through the program one step at a time just by repeatedly hitting the enter key, which is a vastly superior experience to running a command, editing it to increment an integer, and then running it again. One person, no matter how much experience they have with k, who would rather use the old paradigm of debugging when they have already become experienced with it and mastered it, is not a compelling argument that it should be the first experience we show to new developers about how to code in k.
As far as the issue of symbolic execution and proofs, I called the section "basic concepts" for a reason. It's intended to teach the fundamental concepts of programming in K. I don't understand how anyone can seriously claim that symbolic execution and proofs aren't a basic concept of developing in K when it's one of the most important things we do with k, arguably even more important than concrete execution.
Once again, there is the separation between K (the language) and K (the tooling). We should definitely have a tutorial track focused entirely on K (the language), which I'm proposing your basic concepts track should cover.
Also keep in mind that there is a difference between learning a language and using a language. I don't think anyone will disagree with you that when using K, you should use the best possible tools available (that is, the debugger in this example). But when learning a language, it often can be fruitful to use less efficient tools to reduce the initial learning curve while still providing some insight into the internals of the tooling. Indeed, my guess is that when people are initially learning K, they will not be using any debugging techniques, just krun
directly. With the aid of the bison parser, this will still be very fast.
I strongly disagree that symbolic execution is part of basic concepts though. You must first explain what symbolic execution even is in general, not even specific to K. Then you can show how it works with K by introducing logical variables. That is a steep learning curve.
The first part of the tutorial should be focused on learning K the language, not on the surrounding tooling which requires more explanation. The reason for this is simple: keep the learning curve as shallow as possible. Learning to use a bunch of separate tools at the same time as learning the language itself is both distracting and more difficult for people who are trying to jump into K.
Also keep in mind how the tutorial will be presented. I would like the entirety of the Basic Concepts section to be hosted on MiPasa, so that people don't even need to install K to run. Only krun via the LLVM backend (with kx fast paths) will be available on MiPasa. The first part of the intermediate section can then say "ok, now you need to have a local install of K, so you have access to more tools from the ecosystem".
I'd like to hear from @grosu , who has the most experience of any of us teaching, and a junior developer who doesn't know K very well (@emarzion maybe?) to see what they think sounds more useful of the two approaches. Is it better to introduce more complex concepts that nevertheless form the foundation of long term understanding, like symbolic execution, earlier or later? What would k newbies actually rather use? krun --depth or a debugger? Neither of us has taught much in the past, and we both are senior devs, which gives us a bias to our thought process that is probably not helpful here. I hear what you're saying and you make good points, but I think we need to hear from people with a different perspective on these topics before making a decision one way or the other.
Well, to be clear, I have taught in the past. I TA'ed the programming language course several times while I was at UIUC, and have other teaching experience from before that. Obviously @grosu has a lot more experience than that.
But I'd like to emphasize again, teaching K should be done in-browser initially, and so we should focus on tools we can teach in-browser. Without significant work, we won't be able to run gdb in browser interactively, but we already have control over how krun
is invoked in the existing MiPasa prototype, so we can add the --depth
option.
There will be many types of people who want to learn a bit of K, ranging from people who want to know all the dirty details and how to use the advanced tooling, to those who just want to spend an afternoon trying it out. The initial part of the tutorial should be focused on servicing the second group.
I am surprised to see "User lists" before "List / Map / Set". I suppose I see more user lists than Lists, but Map and Set are very common. I tend to think cell multiplicity should be covered at the same time as configurations and cells; I don't think I've seen any non-trivial configuration that didn't use multiplicity. Then the tutorial would have to cover Map and Set before configurations. I would also consider covering the <k>
cell separately, because it is special.
I can move user lists later, but I can't move Map/Set before configurations because I can't introduce Map/Set easily until I introduce top level rules, because before that point, all of the examples consist of parsing a function term directly into the k cell and just evaluating the function, which I can't do with functions over maps and sets because Bison doesn't support the syntax of the MAP or SET module.
I'm not opposed to inserting a lesson prior to configurations in general that talks about the k cell, ellipses in cells, and k sequences, though.
Leaving aside the technical issues about browsers and focusing solely on conceptual difficulty, I don't think that symbolic execution is too hard to grasp; and I think that step-by-step debugging is nicer for beginners because it is easier to determine what exactly is going on. On the other hand, you could argue that packing in too many new tools and concepts and making the beginner tutorial confusing/difficult to use/etc. is a greater risk than making it too simple by setting them aside for the intermediate tutorial.
I think the objectives of the beginner section of the tutorial should be:
pending-documentation
)These are some more specific suggestion I have:
I think that multiplicity, functions, and fresh constants should be introduced in the beginner section since they're used in the most basic languages
Minimize the space of configuration space of options that the user must deal with. This means:
I feel that the web K can be useful for giving people a taste of K, however I think we should treat that as a separate objective from the tutorial.
Thanks so much @dwightguth for taking the lead on this, and everybody else for suggestions! As much as I'd like to jump into the discussions and have my own opinions, I feel that it is better to let you guys converge on it. I am just too biased and obsessed with foundations of PL. For example, I would definitely include a non-trivial language. I think our most interesting language currently is Agent, which was a modularized version of Challenge (from the original journal paper on K with @traiansf). That is a very tricky language, that would convince everybody of K's strength when it gets to defining complex, tricky languages. But again, I am biased. In case you do want to include that, though, probably @traiansf would be the best person to write that section. One thing I can do is to run this tutorial by my students this semester, in CS422.
Well, as the lessons get more advanced, I expect some of the lesson's exercises will involve improving and modifying a programming language. I think it's more likely we're going to focus on some kind of relatively simple imperative language for the most part though, with various features added bit by bit in order to demonstrate features of K.
I'm not sure why amb
is that late in the list. It's one of the first things that new users struggle with when they are developing their languages.
I would make a bullet in the basic topics with syntax specification that would include BNF, disambiguation (priorities, associativity...), parser generation.
Are you planning on including anywhere the fact that K is parsed in multiple steps?
@radumereuta we are going to cover ambiguities during the syntax generation lesson, which is currently the second lesson out of all of them. The advanced lesson on amb
deals with how to use the llvm backend and the bison parser to deal with languages that can't be expressed by context free grammars and require the use of rewriting to disambiguate.
@dwightguth As a K beginner, I think the current proposed format is quite reasonable.
I would also like to learn a bit about how to add more built-in data types, esp. complex ones (e.g. matrix).
@ehildenb if you find the time please look into writing the proofs tutorial.
Current tutorial is focused on teaching programming languages, not on teaching K. We need a new tutorial that focuses on teaching K by starting out with easy K definitions which solve simple tasks (eg. "calculate fibonacci numbers"), and not on building programming languages.