runtimeverification / k

K Framework Tools 7.0
BSD 3-Clause "New" or "Revised" License
443 stars 145 forks source link

Tutorial for learning K #1655

Open ehildenb opened 3 years ago

ehildenb commented 3 years ago

Current tutorial is focused on teaching programming languages, not on teaching K. We need a new tutorial that focuses on teaching K by starting out with easy K definitions which solve simple tasks (eg. "calculate fibonacci numbers"), and not on building programming languages.

boggle commented 3 years ago

I'd really appreciate a decent manual for k, this is really hard to get into without one.

dwightguth commented 3 years ago

Here is the proposed lesson outline for the proposed tutorial:

Basic concepts

  1. Productions / the rewrite operator / matching / variables / rules / attributes
  2. BNF syntax / parser generation
  3. Modules and imports
  4. Integers and booleans
  5. Side conditions and owise
  6. Literate programming in Markdown
  7. Unparsing, brackets, and the format and color attributes
  8. Strings
  9. User lists
  10. The k cell / top level rules / cell ellipses / the k sequence operator
  11. Configuration declarations / cell nesting
  12. Strictness, contexts, and other forms of evaluation order
  13. Casts
  14. List / Map / Set
  15. Cell multiplicity
  16. Term equality and conditionals
  17. Debugging with gdb
  18. Backends and search
  19. Unification and symbolic execution
  20. Basic proofs / ensures

Intermediate concepts

  1. Macros, aliases, and anywhere rules
  2. Fresh constants
  3. Explicit klabels
  4. Overloaded productions
  5. Matching logic connectives / or patterns
  6. Function context
  7. Record productions and named nonterminals
  8. fun / #let bindings

  9. as

  10. :=K and :/=K
  11. Floating point and machine integers
  12. Substitution
  13. I/O
  14. String buffers and Bytes
  15. KORE
  16. The Haskell backend REPL

Advanced topics

  1. Layout

  2. amb
  3. location

  4. JSON
  5. Rat
  6. FFI
  7. Custom hooks
  8. Scripting

I will get started on the basic lessons shortly unless anyone has any last minute suggestions for alterations.

ehildenb commented 3 years ago

I would move "Debugging with gdb", "Backends and search", "Unification and symbolic execution", "Basic proofs / ensures" to intermediate concepts. This groups it with other specialized tooling that exists in intermediate, and it makes it so that basic concepts is focused on K, and not on tooling.

I would move "List, Map, and Set", "Function context", and "Record productions and named nonterminals" to the basic concepts.

I would also add a section (in intermediate) on configuration variables.

sskeirik commented 3 years ago

Do we want to include a section on using explicit klabels to work around K parser issues (e.g. what if my language syntax uses a reserved keyword in K?)?

Does the parser generation section cover semantics which consume multiple kinds of inputs (i.e. more than one input file or an input file and a string, etc.)?

ehildenb commented 3 years ago

@sskeirik see point (5) in intermediate for explicit klabels.

dwightguth commented 3 years ago

@sskeirik I believe the parser generation for other configuration variables is going to be covered in the section on cells and the configuration, because it can't really be introduced until we introduce configuration variables themselves. The parser generation section is going to focus on the basics of using Bison parsing, priority, associativity, prefer/avoid, and lexical syntax.

dwightguth commented 3 years ago

I tend to disagree with Everett's suggested reorderings, at least some of them. I definitely think proofs should be part of the basic concepts, and that necessitates building up to them with most of the other sections I mentioned. Plus, people are going to want to understand how to debug sooner rather than later because it's a very important basic concept when developing code, especially if you're not familiar with a language. Also, I guess I don't really see that function context and record productions are really basic features of the language compared to the other items in Basic Concepts right now. I'm not opposed to moving List/Map/Set earlier, though

ehildenb commented 3 years ago

@dwightguth I don't feel strongly about function context and record productions, so those can stay as you have put them. Let's please move List/Map/Set earlier though.

I do think we should not introduce more advanced tooling in basic concepts though. I rarely use the LLVM debugger, and I'm an advanced K user. I only use it to debug very tricky cases in the semantics or bugs in the LLVM backend itself, which should not come up in the basic concepts section. If you want to introduce debugging techniques in basic concepts, I would suggest instead going for showing them how to use --depth.

Similarly for proving, I think that basic concepts should be focused on teaching K (the language), not on teaching what you can do with K. Proving is a use of K, not inherent to the K language itself.

ehildenb commented 3 years ago

In fact, I would say that the entire basic concepts section should be do-able with just the llvm backend (no symbolic execution). It should be focused on teaching K (the language), not K (the ecosystem of tools).

dwightguth commented 3 years ago

I've had a number of people come to me over the past year or so dissatisfied with the status quo of debugging in k, which essentially amounted to repeatedly executing code up to a certain depth until you find where the problem is. As a result, I built the gdb set of tools for debugging k definitions, which allows you to break when certain side conditions are evaluated, certain rules are applied, or certain functions are called, and immediately see the values of the arguments to the function, or the current configuration. It is also possible to step through the program one step at a time just by repeatedly hitting the enter key, which is a vastly superior experience to running a command, editing it to increment an integer, and then running it again. One person, no matter how much experience they have with k, who would rather use the old paradigm of debugging when they have already become experienced with it and mastered it, is not a compelling argument that it should be the first experience we show to new developers about how to code in k.

As far as the issue of symbolic execution and proofs, I called the section "basic concepts" for a reason. It's intended to teach the fundamental concepts of programming in K. I don't understand how anyone can seriously claim that symbolic execution and proofs aren't a basic concept of developing in K when it's one of the most important things we do with k, arguably even more important than concrete execution.

ehildenb commented 3 years ago

Once again, there is the separation between K (the language) and K (the tooling). We should definitely have a tutorial track focused entirely on K (the language), which I'm proposing your basic concepts track should cover.

Also keep in mind that there is a difference between learning a language and using a language. I don't think anyone will disagree with you that when using K, you should use the best possible tools available (that is, the debugger in this example). But when learning a language, it often can be fruitful to use less efficient tools to reduce the initial learning curve while still providing some insight into the internals of the tooling. Indeed, my guess is that when people are initially learning K, they will not be using any debugging techniques, just krun directly. With the aid of the bison parser, this will still be very fast.

I strongly disagree that symbolic execution is part of basic concepts though. You must first explain what symbolic execution even is in general, not even specific to K. Then you can show how it works with K by introducing logical variables. That is a steep learning curve.

The first part of the tutorial should be focused on learning K the language, not on the surrounding tooling which requires more explanation. The reason for this is simple: keep the learning curve as shallow as possible. Learning to use a bunch of separate tools at the same time as learning the language itself is both distracting and more difficult for people who are trying to jump into K.

Also keep in mind how the tutorial will be presented. I would like the entirety of the Basic Concepts section to be hosted on MiPasa, so that people don't even need to install K to run. Only krun via the LLVM backend (with kx fast paths) will be available on MiPasa. The first part of the intermediate section can then say "ok, now you need to have a local install of K, so you have access to more tools from the ecosystem".

dwightguth commented 3 years ago

I'd like to hear from @grosu , who has the most experience of any of us teaching, and a junior developer who doesn't know K very well (@emarzion maybe?) to see what they think sounds more useful of the two approaches. Is it better to introduce more complex concepts that nevertheless form the foundation of long term understanding, like symbolic execution, earlier or later? What would k newbies actually rather use? krun --depth or a debugger? Neither of us has taught much in the past, and we both are senior devs, which gives us a bias to our thought process that is probably not helpful here. I hear what you're saying and you make good points, but I think we need to hear from people with a different perspective on these topics before making a decision one way or the other.

ehildenb commented 3 years ago

Well, to be clear, I have taught in the past. I TA'ed the programming language course several times while I was at UIUC, and have other teaching experience from before that. Obviously @grosu has a lot more experience than that.

But I'd like to emphasize again, teaching K should be done in-browser initially, and so we should focus on tools we can teach in-browser. Without significant work, we won't be able to run gdb in browser interactively, but we already have control over how krun is invoked in the existing MiPasa prototype, so we can add the --depth option.

There will be many types of people who want to learn a bit of K, ranging from people who want to know all the dirty details and how to use the advanced tooling, to those who just want to spend an afternoon trying it out. The initial part of the tutorial should be focused on servicing the second group.

ttuegel commented 3 years ago

I am surprised to see "User lists" before "List / Map / Set". I suppose I see more user lists than Lists, but Map and Set are very common. I tend to think cell multiplicity should be covered at the same time as configurations and cells; I don't think I've seen any non-trivial configuration that didn't use multiplicity. Then the tutorial would have to cover Map and Set before configurations. I would also consider covering the <k> cell separately, because it is special.

dwightguth commented 3 years ago

I can move user lists later, but I can't move Map/Set before configurations because I can't introduce Map/Set easily until I introduce top level rules, because before that point, all of the examples consist of parsing a function term directly into the k cell and just evaluating the function, which I can't do with functions over maps and sets because Bison doesn't support the syntax of the MAP or SET module.

I'm not opposed to inserting a lesson prior to configurations in general that talks about the k cell, ellipses in cells, and k sequences, though.

emarzion commented 3 years ago

Leaving aside the technical issues about browsers and focusing solely on conceptual difficulty, I don't think that symbolic execution is too hard to grasp; and I think that step-by-step debugging is nicer for beginners because it is easier to determine what exactly is going on. On the other hand, you could argue that packing in too many new tools and concepts and making the beginner tutorial confusing/difficult to use/etc. is a greater risk than making it too simple by setting them aside for the intermediate tutorial.

nishantjr commented 3 years ago

I think the objectives of the beginner section of the tutorial should be:

These are some more specific suggestion I have:

  1. I think that multiplicity, functions, and fresh constants should be introduced in the beginner section since they're used in the most basic languages

  2. Minimize the space of configuration space of options that the user must deal with. This means:

    • While the bison parser is nice to have, I feel that users may easily run into its limitations, and need to deal with frustrating behavior such as non-termination or crashes without good error messages. It also means that users need to deal with the like LR grammars which adds complexity. Perhaps we can revisit this when the bison parser is the default parser?
    • It is better to introduce only one backend. Since one of the primary goals of K is verification, I feel (but not very strongly) that this should be the Haskell backend
    • For a frustration free experience, we need to introduce some debugging technique early, whether for the Haskell backend or LLVM
  3. I feel that the web K can be useful for giving people a taste of K, however I think we should treat that as a separate objective from the tutorial.

grosu commented 3 years ago

Thanks so much @dwightguth for taking the lead on this, and everybody else for suggestions! As much as I'd like to jump into the discussions and have my own opinions, I feel that it is better to let you guys converge on it. I am just too biased and obsessed with foundations of PL. For example, I would definitely include a non-trivial language. I think our most interesting language currently is Agent, which was a modularized version of Challenge (from the original journal paper on K with @traiansf). That is a very tricky language, that would convince everybody of K's strength when it gets to defining complex, tricky languages. But again, I am biased. In case you do want to include that, though, probably @traiansf would be the best person to write that section. One thing I can do is to run this tutorial by my students this semester, in CS422.

dwightguth commented 3 years ago

Well, as the lessons get more advanced, I expect some of the lesson's exercises will involve improving and modifying a programming language. I think it's more likely we're going to focus on some kind of relatively simple imperative language for the most part though, with various features added bit by bit in order to demonstrate features of K.

radumereuta commented 3 years ago

I'm not sure why amb is that late in the list. It's one of the first things that new users struggle with when they are developing their languages. I would make a bullet in the basic topics with syntax specification that would include BNF, disambiguation (priorities, associativity...), parser generation.

Are you planning on including anywhere the fact that K is parsed in multiple steps?

dwightguth commented 3 years ago

@radumereuta we are going to cover ambiguities during the syntax generation lesson, which is currently the second lesson out of all of them. The advanced lesson on amb deals with how to use the llvm backend and the bison parser to deal with languages that can't be expressed by context free grammars and require the use of rewriting to disambiguate.

MirceaS commented 3 years ago

@dwightguth As a K beginner, I think the current proposed format is quite reasonable.

boggle commented 3 years ago

I would also like to learn a bit about how to add more built-in data types, esp. complex ones (e.g. matrix).

radumereuta commented 2 years ago

@ehildenb if you find the time please look into writing the proofs tutorial.