ossu / computer-science

:mortar_board: Path to a free self-taught education in Computer Science!
MIT License

Choosing the introductory course(s) and language #540

Closed joshmhanson closed 4 years ago

joshmhanson commented 5 years ago

Much has been left undecided with regard to what Intro CS is going to look like in v9 of the curriculum.

Not wanting to start from scratch (i.e. from my own intuition) in answering this question, I went looking for what experts had to say. I was lucky enough to encounter a dialogue between Yaron Minsky (an experienced industrial developer) and Matthias Felleisen (one of the top academics in computer science pedagogy, and author of How to Design Programs), underneath a blog post on this very subject.

Yaron Minsky:

The first formal programming class I took was COS 217 at Princeton, taught by the excellent (and at the time, I thought, terrifying) Anne Rogers. The course was (and is) taught in C, and the intellectual approach of the class was to start from the machine. Instead of just learning to program in C, we learned about how the machines we were programming to worked. That was where I first encountered instruction counters, stack frames, registers and the memory hierarchy. It was exhilarating.

Where C encourages you to start with the machine, Scheme wants you to start at the mathematical foundations of computation. You don’t need to know what the lambda calculus is to appreciate Scheme’s slim core, and the way in which you can build a rich and vibrant computational world on top of it. That core is expressive enough that it makes it natural to introduce ideas that come up in multiple different languages, including functional and imperative techniques, object orientation, and logic programming.

The classic course in this vein is MIT’s 6.001, also known as SICP, The Structure and Interpretation of Computer Programs. Sadly, 6.001 has been retired at MIT, but the book lives on, and is a worthwhile read even if you took your last CS course years ago.

MIT replaced SICP with a course based on Python, and this reflects a broader trend. As was highlighted by an informal study by Philip Guo, lots of schools now teach Python, particularly for early introductory courses. I have mixed feelings about this choice. Python is a wonderfully friendly language, but that friendliness is bundled together with some problems.

This was made apparent to me in part by my experience with students who chose to code in their interviews in Python. In many ways, Python is the ideal interview language, since its concise and readable syntax makes the constraints of coding on the whiteboard more bearable. But what I saw was that students who learn Python often walk away with a rather rough model of the semantics of the language. You might be surprised at what fraction of students who have programmed extensively in Python can’t guess how Python lists might be implemented, to say nothing of their ability to explain the semantics of language features like generators or decorators.

This isn’t really a knock on Python. After all, there’s something great about a tool that lets you get things done without fully understanding how it works. But in different ways, both Scheme and C encourage you to understand what’s going on from the ground up, and there’s a certain pedagogical power in that. All in, I think Python is a great choice for an early introductory course, particularly one meant for those who aren’t going to end up as computer scientists or full-time programmers. But I’d be leery of using it outside of those circumstances.

Matthias Felleisen:

Yaron, your question -- as stated -- doesn't denote. It needs context. And Bob Muller's "failure" is indicative of this problem.

Let's assume you really mean "teaching programming to young students in high school or in college at the freshman level" and let's accept that we should not send out kids with one programming course into industry because we also don't send out first-semester pharmacists into pharmacies.

Here is what you will want. First, it needs to be a bit of fun. If your first program is print "hello world", these kids can't relate. They haven't seen consoles. They have seen graphical user interfaces, they have seen interactive games (and may have played them) and they have seen animations. A lecturer should show a rocket taking off or a horse buggy going across the screen as the hello-world program, the very first program you ever show.

Second, you are confronting novices. When they get to touch the keyboard, they are guaranteed to do one and only one thing: make mistakes. ALL (note all caps) off-the-shelf languages are geared to give error feedback assuming "adult" programmers (people who know most of the language) are the actual users. Invariably this feedback is weird and incomprehensible to novices. ERRORS MATTER.

Third, the language should frame the problem-solving method. Ergo, you first pick a method that gradually takes students from an empty screen and a problem statement to a program that solves the problem (probably). Based on that choice, you create the language that frames this design story. I fully agree that (rich) types (as found in ML or Haskell) are the fundamental guiding system for design. (I have taught with types since 1992.) That does NOT mean that you need to check those types from the beginning. Why? See the second point. A type system is likely to give horrible feedback when novices do what they do best: make mistakes.

Fourth, the second and third points suggest that you use more than one language. And the third point means that the languages "grow" on the students with the expansion of the design methodology. Naturally this should end in a typed setting like the ones that you propose. Does it have to be OCaml or SML or Haskell? Bob Muller's experience shows that this is not so. You need to create a language that fits.

Fifth, there is no contradiction between choosing "Scheme" and still teaching about "instruction counters, stack frames, registers and the memory hierarchy." When I taught at Rice, my course started with a Scheme-like language and always ended with us writing a complete simulator of a cpu, an instruction interpreter, a memory. Then students would write a loader, a linker, and hand-translate Fortran (yes, oh well) into the asm of the simulated machine -- all while observing the functional design recipes that they picked up in the first 12 weeks of the course.

With that in mind, below I'm listing some ideals I want to strive towards.

My goal now is to evaluate each of the following resources on a spectrum (🙁, 😐, and 😀) based on the above four criteria. I may need help with this since I haven't done all of these courses myself.

I will come back later to start adding my own evaluations, but anyone is free and encouraged to contribute their own, to provide feedback on the criteria I listed, or to suggest other intro courses.

Alaharon123 commented 5 years ago

I don't have any comment, but I just want to drop another thing to look at. The author of HtDP wrote a paper comparing SICP to HtDP and talking about intro cs classes https://www2.ccs.neu.edu/racket/pubs/jfp2004-fffk.pdf

joshmhanson commented 5 years ago

Thanks @Alaharon123!

The paper you linked reinforces many of the reasons I have for stating the ideals for an introductory course. Broadly, I think the HtDP approach is more appropriate than the SICP approach because absolute beginners need to be told exactly what to do, which the design recipes do well.

Response to Philip Wadler

The paper also references Philip Wadler's critique of SICP: https://www.cs.kent.ac.uk/people/staff/dat/miranda/wadler87.pdf

I think this paper is worth responding to. He likes SICP a lot, but his complaints concern the choice of Scheme as its language. He feels that Scheme's lack of the following features is a problem:

  1. Pattern-matching.
  2. A syntax close to traditional mathematical notation.
  3. A static type discipline and user-defined types.
  4. Lazy evaluation.

(It is worth noting that his experience is based on teaching SICP to undergraduates and teaching Miranda to Masters of Science graduate students in functional programming. Is it any surprise the MSc students fared better?!)

My commentary on these criticisms, in the context of choosing a language and course material for college-level absolute beginners:

  1. Pattern-matching: Disagree. Pattern matching is syntactic sugar for something more complex going on; this complexity should be added later as a convenience, once the student understands how and why it's useful.
  2. Mathematical notation: Partially agreed. It is the one thing that annoys me about using Scheme/Racket as a language for beginners, who have been using infix functions (+ - / ×) since elementary school. On the other hand, the lambda calculus is all prefixes just like Scheme, so it can be useful if you're teaching them side-by-side.
  3. Static types: Neutral. This can work both ways. You can start with a single static unitype (like TypeScript's "any" type) and then gradually add type refinements until you arrive at static multityping, or you can start with static multityping and build more and more general types as needed (which requires user-defined types). I write more about this below.
  4. Lazy evaluation: Completely disagree. Philip Wadler is ridiculously intelligent and he, by his own admission, has no idea what's hard for normal people. Very smart people often have trouble reasoning about the performance of a program written in a lazy programming language. Why should we inflict this on absolute beginners?

More commentary on a student's first language being dynamically or statically typed

One could argue that starting with static types is theoretically preferable for teaching purposes due to research suggesting that the earlier a student gets feedback, the better. (The "feedback" here is type error messages emitted by the compiler.)

However, in practice, the How to Design Programs team uses a dynamically typed language because those compiler type error messages tend to be very novice-hostile.

I would offer another reason why it's better to start with dynamic types: the original lambda calculus was untyped. It was the most expressive of all lambda calculi at the lowest "level" so to speak. Newer lambda calculi were developed later that were less expressive at lower levels, in order to achieve more expressiveness at higher levels.

It was probably easiest for Church to start with the lowest level and then add higher levels later. I suggest that absolute novices will likewise find it easier to move in a similar progression up the expressiveness ladder rather than starting at a higher level first and moving down. Therefore, we should start with something that looks more like the untyped lambda calculus (strictly speaking, untyped != dynamically typed, but I'm speaking by analogy), and then later move to languages that more literally resemble the typed lambda calculus, then System F, etc.

(Edit for clarity: I realize some people may be confused by my phrasing of "higher-level" and "lower-level" since these can mean many different things. I am not referring to "low-level code" like assembly or binary. I am speaking by rough analogy in terms of type theory. You can gain an intuition about these "levels" by reading this Stack Overflow discussion.)

joshmhanson commented 5 years ago

Below is my evaluation of Harvard CS50.

Conceptual simplicity: 😀. (Assuming we limit the course to C.)

Practical simplicity: 🙁.

Level-appropriateness: 🙁.

Resource richness: 😀.

joshmhanson commented 5 years ago

Below is my evaluation of MIT SICP.

Conceptual simplicity: 😀.

Practical simplicity: 🙁.

Level-appropriateness: 😐.

Resource richness: 😐.

joshmhanson commented 5 years ago

Below is my evaluation of Systematic Program Design.

Conceptual simplicity: 😀.

Practical simplicity: 😀.

Level-appropriateness: 😀.

Resource richness: 😐 (tentative).

joshmhanson commented 5 years ago

In issue #530 @Pranaybee suggested that we add a course on introductory logic in Intro CS. I am responding here.

A few points:

In summary, I don't think it's necessary at all to include a full course on logic in our introduction. However, I encourage you to open issues for any course where you feel you weren't properly prepared in terms of background knowledge on logic, and we can figure out how to resolve those individual problems. I had already studied logic in college before starting Nand2Tetris I, so I was well-prepared and actually worked out (on paper) much of the homework in the concise logic syntax I was familiar with. It's certainly not impossible that others would need such preparation, but so far I've not heard any feedback indicating that.

bradleygrant commented 5 years ago

From Minsky, above:

students who learn Python often walk away with a rather rough model of the semantics of the language. You might be surprised at what fraction of students who have programmed extensively in Python can’t guess how Python lists might be implemented, to say nothing of their ability to explain the semantics of language features like generators or decorators.

It's a very tall order to get from "hello world" to "generators and decorators" in a one-semester class in Python, especially a first programming class, and as they're language-specific features (and the official position of OSSU is to not teach a single specific language), I'm not sure that level of inquiry is helpful or necessary in an intro class. It's a valid criticism, one that's on a similar level to "CS grads these days can't even code", but I think it tells more about the state of the industry than the goal of any one class.
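For readers who have not met the two features Minsky mentions, the following is a minimal sketch of what each one does, not a suggestion that an intro class should cover them. The names `countdown`, `logged`, and `add` are made up for illustration.

```python
import functools


def countdown(n):
    """Generator: the body runs lazily, pausing at each `yield`."""
    while n > 0:
        yield n
        n -= 1


def logged(func):
    """Decorator: takes a function and returns a wrapped replacement."""
    calls = []

    @functools.wraps(func)
    def wrapper(*args):
        calls.append(args)  # record the arguments before delegating
        return func(*args)

    wrapper.calls = calls  # expose the call log for inspection
    return wrapper


@logged  # equivalent to writing: add = logged(add)
def add(a, b):
    return a + b


gen = countdown(3)
first = next(gen)   # runs the body up to the first yield
rest = list(gen)    # resumes where it paused and drains the generator
total = add(2, 3)   # goes through `wrapper`, which records the call
```

Explaining the semantics here means being able to say when the body of `countdown` actually runs, and which function object the name `add` refers to after the `@logged` line.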

To be clear, I don't think that was your intention anyway. You're bringing up the "what first language is best?" debate, and as the authors have alluded, the best answer may be "all of them". I'll write some more thoughts on this under separate cover.

bradleygrant commented 5 years ago

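The "perfect CS intro class" is a pretty hard problem to solve, because you're trying to teach several things simultaneously:

  • how computers work on a basic, fundamental level
  • how to think in machine logic
  • how to express those thoughts in a way that's comprehensible to both computer and human
  • how to speak and understand a literal, brand-new language
  • how to use these building blocks to accomplish something useful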

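And from an administrative perspective, you want that class to do several things as part of the overall program:

  • attract and introduce students to the program
  • be rigorous enough to teach something of use, so students see the value of the larger program
  • simultaneously, be attainable such that students complete the class and gain a sense of achievement
  • set expectations for the rest of the curriculum, giving students an opportunity to funnel in or sieve out ("this is for me" vs "this is definitely not for me")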

OSSU has additional challenges and requirements, owing to the fact that the materials have to be freely available and that the students are self-motivated and self-paced. MIT can tell their in-seat students "Endure this pain and get your A, in doing so we will forge you into our image". That works great for the kids who got into MIT and have a lot to lose by failing out. MOOC participants can simply bounce out and find another class (or not).

So the responsibility of having to choose a perfect first class is not an enviable one! I solved this problem for myself by taking several of them. And in my estimation, it was not a waste of time; I gained something different from every class. That's not an ideal solution either (though I didn't mind it); there's duplicated effort, but it exposed me to a number of different teaching styles, coding styles and pedagogical choices.

For your consideration:

So we've got pieces of classes that are individually excellent. I'm not sure what to do with all this in order to get to a "CS intro mega-master course".

I'll submit to you that my wife is currently taking paid intro to CS classes at our local university. Their approach is to teach the language constructs one semester, computer science basics the second semester, and data structures in the third semester. Maybe the intro class doesn't need to be one class... (and I feel like I've learned more for free than she's paid to learn!)

joshmhanson commented 5 years ago

Since several courses use Python, and I seem to have been knocking it earlier, I want to add some clarifications on my specific thoughts regarding using Python in an introductory context.

Yaron Minsky wrote:

This was made apparent to me in part by my experience with students who chose to code in their interviews in Python. In many ways, Python is the ideal interview language, since its concise and readable syntax makes the constraints of coding on the whiteboard more bearable. But what I saw was that students who learn Python often walk away with a rather rough model of the semantics of the language. You might be surprised at what fraction of students who have programmed extensively in Python can’t guess how Python lists might be implemented, to say nothing of their ability to explain the semantics of language features like generators or decorators.

My major response to this is that, for the purposes of introductory computing, manner of teaching overrides actual knowledge of a language. (In other words, given a sufficiently good teacher, you can teach introductory computing with basically any language. It's just that some are more convenient than others.)

That some beginning Python users appear not to have a crystal clear understanding of their language's semantics may be an inconvenience with Python itself (with which I am not terribly familiar), or it may be a problem with the way it is taught.

You could make the same mistakes when teaching any other language. Let's take singly-linked lists as an example. I think it would be a mistake to teach them without, in parallel (whether before or after showing list literal syntax), defining them in a precise way, in the same way you define any other compound structure as a composition of atomic structures. For example, some materials might first introduce them like so:

type 'a mylist =
  | Empty
  | Cons of 'a * 'a mylist

They would explain what this means and how to use it using other language primitives, for example:

Cons ('a', (Cons ('b', (Cons ('c', Empty)))))

Then, they might show some intermediate form using more typical list syntax with [] instead of Empty and :: instead of Cons:

'a' :: 'b' :: 'c' :: []

And then finally they would show the full list literal syntax: ['a'; 'b'; 'c'].

This kind of progression is basically what I would hope to see in an introductory course — e.g., clear definitions of very common compound structures using simple primitives. If we choose to start on this layer of abstraction, then we shouldn't be concerned with exactly how this is implemented on a lower-level (e.g. using pointers, understanding of which requires a whole new set of hardware-oriented primitives), only with the abstract mathematical structure.

Back to Python: I have no idea how Python lists are implemented (I'm guessing they are dynamic arrays of pointers to PyObjects?), but this doesn't actually matter. How a language is implemented is completely irrelevant in the context of teaching it; a good teaching language may actually need a very complicated implementation!
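The guess is roughly right for CPython, where a list is a resizable array of references to objects. A toy sketch of that idea follows. The class name `DynamicArray` and the doubling growth policy are assumptions chosen for illustration; CPython's actual over-allocation scheme is different.

```python
class DynamicArray:
    """Toy growable array. Doubling is an assumed policy, not CPython's."""

    def __init__(self):
        self._capacity = 1
        self._size = 0
        # Stands in for a fixed-size block of references.
        self._slots = [None] * self._capacity

    def append(self, item):
        if self._size == self._capacity:
            self._grow()
        self._slots[self._size] = item
        self._size += 1

    def _grow(self):
        # Allocate a bigger block and copy the existing references over.
        self._capacity *= 2
        new_slots = [None] * self._capacity
        for i in range(self._size):
            new_slots[i] = self._slots[i]
        self._slots = new_slots

    def __getitem__(self, index):
        if not 0 <= index < self._size:
            raise IndexError("index out of range")
        return self._slots[index]

    def __len__(self):
        return self._size


arr = DynamicArray()
for ch in "abcde":
    arr.append(ch)
```

Indexing is a direct slot lookup, and most appends just fill a spare slot. Only the occasional append that triggers `_grow` pays for a full copy.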

What's needed is a good teacher. It is perfectly acceptable to tell "lies" about how something works — perhaps pretending that Python lists are singly-linked lists, and using comments to describe the type of these lists since Python lacks user-defined type syntax. (These aren't necessarily "lies", but rather models, which are always, by definition, a simplification of reality.) Then the teacher can later correct the lie once the student has learned the more advanced topics needed to truly understand what's going on.
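A minimal sketch of what such a "lie" could look like in Python, with the data definition written as a comment. The helper names `cons`, `length`, and `to_python_list` are hypothetical, and representing the empty list as `None` is an assumption made for the example.

```python
# A list of X is one of:
#   - None            (the empty list)
#   - (first, rest)   where first is an X and rest is a list of X


def cons(first, rest):
    # X, list of X -> list of X
    return (first, rest)


def length(lst):
    # list of X -> int
    if lst is None:
        return 0
    _, rest = lst
    return 1 + length(rest)


def to_python_list(lst):
    # list of X -> built-in Python list, for comparison with literal syntax
    out = []
    while lst is not None:
        first, lst = lst
        out.append(first)
    return out


abc = cons("a", cons("b", cons("c", None)))
```

Here `abc` is the nested pair `("a", ("b", ("c", None)))`, mirroring the `Cons`/`Empty` definition above. The built-in literal `["a", "b", "c"]` can then be introduced as a convenient notation for the same abstract sequence.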

Some languages are designed to require as few "lies" as possible, and these are more convenient for guiding the teacher, but these languages may not necessarily overlap with the best overall languages for teaching. We have to weigh the benefits and drawbacks in every individual case.

joshmhanson commented 5 years ago

@bradleygrant I didn't see your two posts until I posted mine (it took me a while to write) so don't assume anything I said is a direct response :)

bradleygrant commented 5 years ago

Here's a write-up of GTx's CS1301 Computing in Python series (and the beginning of a generalized discussion of Python).

Discussions of Python's conceptual simplicity are over my head at this stage, and I will defer to others.

Practical simplicity: 😐/😀.

Level appropriateness: 😐/😀.

I'll defer to @hanjiexi et al. on what features of a programming language make it a good first programming language pedagogically, but for me, I wanted to quickly get up to speed in a language in wide use in industry, and Python fit the bill for me.

Resource richness: 😀 (specific to GTx CS1301).

joshmhanson commented 5 years ago

Thanks for your help @bradleygrant! Below is my feedback.

The "perfect CS intro class" is a pretty hard problem to solve, because you're trying to teach several things simultaneously:

  • how computers work on a basic, fundamental level
  • how to think in machine logic
  • how to express those thoughts in a way that's comprehensible to both computer and human
  • how to speak and understand a literal, brand-new language
  • how to use these building blocks to accomplish something useful

When we had CS50, the scope was indeed this broad in Intro CS. But what we're discussing as part of the broader changes for v9 is to "refactor" our curriculum such that each section has a more focused scope, even creating sub-sections as needed.

I don't think Intro CS should cover in detail how computers work, if by "computers" you mean modern (von Neumann) machines. Modern machines are merely a historical accident; they could have evolved some other way using a very different mechanism. For example, someone once created a computer in Minecraft.

A "computer" can be anything as long as it is simulating a model of computation. The models came first; the machines came decades later. ("Computer" used to actually mean a "human being performing computations", a.k.a. "doing math".)

So I think the Intro CS class should cover how basic computation works, while splitting out how modern machines work in other coursework (since you do have to know this too, after all). We have Nand2Tetris for providing a simplified model of how most contemporary machines work. And we can have a course in Foundations for getting people up to speed on a much, much more basic level.

Regarding "how to think in machine logic": I am not sure what you mean by this, as there is no one thing called machine logic. There are many logics, and there are many possible machines. Perhaps you meant sequential/imperative logic, but machine architectures have been proposed that operate as networks of parallel reduction machines. But I think I am probably just getting too pedantic about your choice of words; I will assume you meant "some logic of any useful language".

My comment on the above is that, if we want to refine what "some logic" refers to, I would prefer (but not too strongly) that the introductory material chosen focuses on combinational logic / functional programming as this provides the most direct transition from high school math.

I say "not too strongly" because people do use imperative style computation sometimes in their day to day lives, such as when sorting objects on a table. But I think it's more complicated (pedagogically) because it leads immediately to a notion of "side-effects", where you start defining what look like mathematical functions, but such "functions", in the process of mapping inputs to outputs, also "do something". It's hard intuitively to find an analog in the real world for this notion, since students have been accustomed so far to keeping math (pure functions) and "doing stuff" (procedures) totally separate.

You have to understand though that I'm speaking in long time scales. Obviously, you know from experience that learning Python on its own as a beginner is very easy, regardless of its divergences from basic math. I'm talking about the overall journey from novice to master. A student's first language impacts how they think tremendously, and later courses (and their lifelong career) can be easier or harder depending on whether they got a very strong conceptual foundation in the beginning. Concretely speaking, if you've spent years writing programs in an ad hoc way (mixing side-effects freely into functions), you will have much less practice with good architectural practices, like confining I/O to the edges of an application and keeping the core business logic clean and de-coupled. (See this video, The Pits of Success.)
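As a rough illustration of "confining I/O to the edges", here is a small sketch in Python. The pricing example and the names `total_price`, `parse_prices`, and `main` are invented for this purpose; the point is only the shape, with pure functions in the middle and one thin impure function on the outside.

```python
def parse_prices(text):
    """Pure helper: turns raw text into a list of numbers."""
    return [float(tok) for tok in text.split()]


def total_price(prices, tax_rate):
    """Pure core: same inputs always give the same output, no I/O."""
    subtotal = sum(prices)
    return round(subtotal * (1 + tax_rate), 2)


def main(read=input, write=print):
    """Impure shell: all reading and printing is confined to this function."""
    raw = read("Prices, separated by spaces: ")
    write(total_price(parse_prices(raw), tax_rate=0.08))
```

Because `read` and `write` are parameters, the shell can be exercised with stand-ins, while `parse_prices` and `total_price` can be checked like ordinary mathematical functions. Calling `main()` with no arguments uses the real `input` and `print`.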

In other words, how much convenience do we want to buy in exchange for knowing that we will need to force students to "unlearn" things down the road? It sucks to unlearn things, especially if you absorbed them a long time ago, but it also totally completely sucks to learn the actual lambda calculus itself on day one. Or to be forced to use a weird text editor from the 60s or solve annoying IT problems when all you wanted to learn was computer science.

And from an administrative perspective, you want that class to do several things as part of the overall program:

  • attract and introduce students to the program
  • be rigorous enough to teach something of use, so students see the value of the larger program
  • simultaneously, be attainable such that students complete the class and gain a sense of achievement
  • set expectations for the rest of the curriculum, giving students an opportunity to funnel in or sieve out ("this is for me" vs "this is definitely not for me")

Yes, yes, yes, and ... sort of! For the last one, I don't want to deliberately funnel people out. They will funnel out by their own choice, but we should discourage this by having a very friendly and engaging intro course. Talent is relatively irrelevant in the long term; any person in good mental health can learn computer science or mathematics.

  • [CS50] also introduce[s] Linux and git very early, something no other class I've taken does.

These are strictly of practical importance when applying CS, particularly at scale, but at the intro stage they're not applying it yet, they're learning it just on their own computers. So I would count that against CS50's level appropriateness.

joshmhanson commented 5 years ago

For your (very useful!) feedback on GT Computing in Python, for brevity I'll try to respond in a more general way instead of quoting every specific question.

Per one of my previous comments, what matters is how the course handles the teaching of Python, not necessarily Python itself. For example, it is good that they provide an in-browser environment to work around difficulties with using Python locally.

The issues you pointed out with Python are not necessarily deal-breakers. Does the course do a lot of handwaving, or are you only talking about Python? Does the course provide a simplified (largely OOP-free) model of how everything works before introducing the object/class system? If not, does it provide a heavily simplified model of OO?

Autocasting behavior of Python looks terrifying. I have a reasonable (though rough) understanding of dynamic types, but autocasting is very mysterious to me because it isn't clear at all just from reading what an expression like 5 * False is going to produce. To me that expression is meaningless.

Relatedly, the overloading of + also looks really strange. I understand why 1 + 1.1 would convert 1 to 1.0 (returning 2.1) in a language with this feature, since the right-hand side has more precision than the left, but what is going on with using + on anything but a number?
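For reference, the rules behind these two cases are small enough to show directly. In Python, `bool` is a subclass of `int`, and `+` is defined separately by each type. The variable names below are only labels for the example.

```python
# bool is a subclass of int, so False acts as 0 and True as 1 in arithmetic.
bool_is_int = issubclass(bool, int)
product = 5 * False

# Mixing int and float converts the int to a float before adding.
mixed = 1 + 1.1

# `+` is defined per type: sequences concatenate rather than add.
joined_text = "ab" + "cd"
joined_list = [1, 2] + [3]

# There is no implicit conversion between str and int, so this fails loudly.
try:
    "1" + 1
    str_plus_int_failed = False
except TypeError:
    str_plus_int_failed = True
```

So `5 * False` evaluates to `0`, and `+` on strings or lists means concatenation. Whether a beginner should have to know this is exactly the "simple" versus "easy" question.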

From the above you might be starting to gain an intuition for what I mean by "simple" as opposed to "easy". Easy means there is lots of magic going on, but the magic is inconsistent such that it doesn't seem to form a cohesive logic, at least not until you've grasped the entire language and its implementation. Simple means there is no magic, and/or the "magic" (usually just syntactic sugar) that's there is so well designed that you can understand its logic with minimal effort.

So, while I can't comment in much detail on how conceptually simple Python might be (without knowing more about it), my real goal here is to learn how the course handles teaching it. I note that Berkeley 61A uses the Python language to teach concepts from SICP, but I haven't actually looked through the course materials yet to see how they're doing it. It could make for an interesting comparison.

joshmhanson commented 5 years ago

Forgot to mention. You can consider practical simplicity to be 😀. And are all course materials and features available for free? If so, resource richness should also be considered 😀.

bradleygrant commented 5 years ago

As a quick aside, your statements around Berkeley 61A and "learning to un-learn later" got me thinking.

In a lot of college programs (at least, in the ten or so I've reviewed), that first class serves a dual purpose. To what extent do you want/envision the OSSU v9 intro class to be a pure computer science class vs. a computer programming class?

How much theory, vs. how much "hello world"?

I realize I may actually be asking about the philosophical overview of the intent of OSSU...but that's probably a valid thing to consider.

joshmhanson commented 5 years ago

That's a great question!

In my view, the purpose of theory is to enhance practice. The purpose of practice is [insert philosophy of life here]. Perhaps you wish to enhance the human race, or make money, or whatever. Regardless, you need to do something, and to do something effectively, you need theory to know what to do. Without theory (i.e. thought) we'd still be living in caves!

By analogy, let's say you want to build spaceships. We need math to figure out how spaceships can fly. We need practice to be good at math. Math is theory. So it all starts with theory, but you need practice to understand it. But that actually wasn't an analogy: programming is math. A program is just a long math formula. You need to program over and over again to understand useful formulas / programming patterns.

With all that context out of the way, it should be clear that I consider a computer programming course to be no different from a math course with a heavy focus on exercises/projects.

And that in a nutshell is my eventual goal for OSSU CS. I want it to be as robust in theory as in practice, but right now we still don't have a robust projects/grading infrastructure. I'm trying to make the content more robust before tackling this huge problem.

For various reasons, most schools focus too much on a very superficial interpretation of "practice" without really requiring strong theory, and a small number of schools go overboard with theory without requiring students to actually use it enough for big and cool projects. There are many incidental, political, economic, and social drivers for this, but none of them are relevant to us. We aren't accountable to anyone nor do we have to worry about making money. We don't even need to be popular. I want our curriculum to be respected, which happens to bring some stable, long-term popularity. Short-term popularity is easy to gain (by watering things down) and easy to lose.

An aside: I consider print "hello world" to be a pedagogical anti-pattern as the first lesson in a course for absolute novices. It assumes you are in a Unix-like environment with a terminal and know what "standard output" means. And it's only really amusing to people who get that it's a reference to K&R.

If you are totally fresh to this world, what might you expect to happen? "Hello world" to appear in a pop-up box on the screen? Should the print command actually talk to your printer, and print out a piece of paper that says Hello world on it? Unless you are in a more verbose language where these things are specified (which brings its own problems), why should it appear in a terminal, when print doesn't say anything about a terminal or "standard output"?

It just raises too many questions for my tastes. You have to start somewhere of course, and I think a REPL environment is better for day 1, especially if it's very clean (no type annotations, etc.).

szaitseff commented 5 years ago

I completed both Harvard's CS50x and MIT's 6.00.1x earlier this year, and can give some feedback from the perspective of a CS Curriculum student. Both courses are of very high quality, but they are very different. Neither of them could be considered "perfect".

6.00.1x is deep and narrow: it illustrates the main CS concepts while teaching the basics of Python, which is a great language for novices. Python can be used nowadays in many areas of life, even by students who are not CS majors. On the other hand, CS50x is broad and, yes, a little bit "shallow" because of the sheer number of topics it exposes students to. And maybe CS50x is closer to a true "intro" course, as it gives students wider background knowledge of what CS is about.

And their combination was about "perfect" for me. After learning some C, I could look at Python with different eyes. And I strongly disagree with the advice, given in the Curriculum, to drop CS50x halfway through the course after C. It may discourage some students from proceeding and learning about a whole stack of web technologies in a very short period of time (yes, with some gnashing of teeth and a lot of self-work). I was able to make an interactive web application as my final project at CS50x and was very proud of it :).

So, from my point of view, there is no one "perfect" intro course for all students. But there could be a reasonably good combination of high-quality courses that ideally use different teaching approaches. As I said, a combination of 6.00.1x + CS50x was about the "perfect" intro in my case. Another student could prefer only one of them.

Therefore, I was surprised to see that CS50x has been moved from Intro to Systems. I also completed Nand2Tetris and Computer Networking this year - yes, they ARE Systems courses! Don't we have enough good courses in Core Systems? I don't see much value in all this "shuffling" of courses around the Curriculum. It makes life harder for students who are already partway through the Curriculum, and newcomers may also be in trouble, as they are now temporarily without an intro course at all until 6.00.1x is available again.

joshmhanson commented 5 years ago

Hi @szaitseff, thank you for your feedback! Also CCing @bradleygrant and @waciumawanjohi

There is tremendous value for sure in CS50, and in learning about C generally. The main issue, as noted above and in many other issues in this repo, is that it is not level-appropriate for absolute novices. This is our target audience for the Intro CS section, and we have to be ruthless in our selection of the right materials.

The question of whether the first course should be focused on computation (via programming) or on surveying the entire field is something that I haven't been able to find solid answers on from computer science education specialists. I currently believe that the former is the better approach overall because hands-on work might get students interested faster than just learning about a bunch of concepts. And it is based on this assumption that I think the curriculum should be refactored to put CS50's introduction to modern technology in Foundations, leave discussion of low-level systems to another later section, and have Intro CS be focused on beginner-level computation.

At this point, I think MIT SICP and CS50 have been definitively ruled out for Intro CS. I believe that if someone evaluated MIT Intro to CS on the stringent Intro CS criteria I named above, it would likely also score poorly, because it does not so much teach Python as assume that you can quickly pick it up.

The jury is still out between GT Computing in Python, Berkeley 61A, and Systematic Program Design for Intro CS, and on whether there is still a place in our curriculum for CS50. Tentatively, I don't think Berkeley 61A will make the cut simply because it's based on SICP, upon which HtDP is known to be a substantial improvement for beginners. (Although I think SICP is still worth working through afterwards.)

I may have a solution that should reasonably satisfy most students as well as our own desire for a robust curriculum; it's just a little more work to put together. Because I want the next curriculum version to include multiple resources for each section, so that students can use what is right for them, we don't have to adopt an all-or-nothing attitude.

We can make two sub-sections under Core Programming: Functional Computation and Imperative Computation, and in these, we integrate appropriate pieces of almost all the great material we're discussing here, since for many of them, level-appropriateness is the biggest problem.

Here's what it could look like:

Intro CS

Systematic Program Design

szaitseff commented 5 years ago

Thanks @hanjiexi !

It makes sense to envision and state the target audience when designing and discussing the Curriculum. Is it boys and girls from high school? Then 'How to code' intro would probably be ok. Or is it people educated in other fields outside CS? Then something more challenging with broader outlook would suit better. I personally disliked the very slow pace of 'How to code' with (almost) useless BSL and dropped it. If you target a wider public from the whole spectrum, there should be a wider choice of intro, not just rather specific 'How to code'.

joshmhanson commented 5 years ago

The upcoming Foundations (#531) top-level section is going to act as a fall-back curriculum for any student who needs additional background outside of computer science. It offers (very limited) support for those who haven't yet taken high school math, and moves on from there to fill in gaps as needed through physics and college-level math.

If a student were able to take Intro CS and Core CS without ever backtracking to Foundations, this would mean that they (at least) took calculus in high school, or that they have some limited college including calculus. So that should give you a good idea of what the main curriculum is targeting. (If we were really seriously targeting pre-college, we would include Bootstrap's materials which teach algebra through programming.)

We don't, however, make too many assumptions about whether a student has ever coded before. How to Code will likely be most appealing to absolute novices, but it is also extremely relevant for those who took a so-called "computer science" class in high school. I took such classes; I would call them "coding", not CS. It will also have relevance for some experienced developers, but it can definitely be a bit boring for them.

The answer for those who think they are probably too advanced for any course, but aren't sure, is to go straight to the final course project and see if they can do it. If they can't, the course might have something for them. I myself was too advanced already for most of How to Code when I took it, but the design principles were something I'll always carry with me, and the final project was still a challenge.

(almost) useless BSL

Note that usefulness has different meanings in different contexts. BSL can't be used in the real world for the same reasons that it is exceptionally useful as a teaching language.

KathrynN commented 5 years ago

Controversial opinion ahead. I am a computer programmer, self-taught, and I love it. It's where I find my "flow". And How To Code: Simple Data is making me hate programming, let alone the course. I'm about half-way through, and I'm trying to persist as I've been told it gets better, or at least so I can give a more thorough review than just "it makes me want to quit", but I'm not sure it should be offered as a "universal beginning course".

tsarchghs commented 5 years ago

Controversial opinion ahead. I am a computer programmer, self-taught, and I love it. It's where I find my "flow". And How To Code: Simple Data is making me hate programming, let alone the course. I'm about half-way through, and I'm trying to persist as I've been told it gets better, or at least so I can give a more thorough review than just "it makes me want to quit", but I'm not sure it should be offered as a "universal beginning course".

+1, I started looking for alternatives almost immediately after taking it.

joshmhanson commented 5 years ago

Hi @KathrynN, thanks for your feedback! It's interesting how the same material produces such different reactions in people.

Given how you described yourself both here and in #516, it sounds like you may be well beyond taking an introductory course, and that the course therefore isn't right for your level. (If your concern about it is actually quite different, I suggest opening a separate issue about it, seeing as the course is currently part of Core Programming.)

So the key point I want to make is that it matters that SPD (HtC/HtDP) is designed for beginners. Beginners need to be taught in a way that is qualitatively different from how those with experience need to be taught.

Put simply, beginners need to be taught not just exactly what to do, but exactly how to do it. SPD fulfills this by providing the beginner with a set of recipes, where the problem statement determines the structure of the data used to solve it, and the student then matches the structure to the recipes, finding appropriate function templates to use for solving the problem. The amount of cognitive load needed for this is kept to a minimum that is believed appropriate for their level.

Later, these recipes get abstracted as students start to recognize patterns in the problem/data relations. Many of them turn out just to be simple functions like map, filter, and fold, which you use every single day in programming (perhaps in other forms).
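As a rough illustration of that abstraction step, here is a sketch in Python rather than in the course's teaching language (BSL). The function names and sample data are invented for this example and do not come from SPD or HtDP. The first three functions all follow the same "empty list / first + rest" template, and the assertions at the end show each one collapsing into `map`, `filter`, or a fold (`functools.reduce`).

```python
from functools import reduce


def double_all(numbers):
    """Template instance: transform every element."""
    if not numbers:                      # base case: empty list
        return []
    first, rest = numbers[0], numbers[1:]
    return [first * 2] + double_all(rest)


def keep_positive(numbers):
    """Template instance: keep only elements that pass a test."""
    if not numbers:
        return []
    first, rest = numbers[0], numbers[1:]
    if first > 0:
        return [first] + keep_positive(rest)
    return keep_positive(rest)


def total(numbers):
    """Template instance: combine all elements into one value."""
    if not numbers:
        return 0
    first, rest = numbers[0], numbers[1:]
    return first + total(rest)


data = [3, -1, 4]

# Each hand-written recursion matches a built-in abstraction.
assert double_all(data) == list(map(lambda n: n * 2, data)) == [6, -2, 8]
assert keep_positive(data) == list(filter(lambda n: n > 0, data)) == [3, 4]
# reduce folds from the left while total() recurses from the right;
# for addition the two give the same result.
assert total(data) == reduce(lambda acc, n: acc + n, data, 0) == 6
```

Only the base-case value and the way `first` is combined with the recursive result differ between the three functions, which is the kind of pattern the later abstraction material asks students to notice.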

In contrast, although more experienced students need to be told what to do, they shouldn't necessarily be told exactly how to do it. This gives them an appropriate level of challenge to maintain their interest. The cognitive load of programming itself is lower at this stage, making room for creativity and problem-solving.

It should therefore be unsurprising that students with programming experience would find a beginner-level course very unpleasant. Without extremely clever course design, it is impossible for one course to satisfy both beginners and everyone else.

I am open to suggestions about placing a non-beginner level Intro CS course alongside SPD as an alternative for those with prior experience, but many of the best such materials (e.g. SICP) might already be going into Core Programming.

KathrynN commented 5 years ago

Hi @hanjiexi - I'm afraid that isn't my issue with the course at all! I enjoyed going over the material in CS50, as it introduced a lot of new concepts, a language I'd always wanted to dabble in (C) and reinforced my thoughts around e.g. good design. Also I enjoy contributing to OSSU and I think it's hard to do so without going into all the courses. I will open a ticket with my thoughts on How To Code and why I think it teaches bad practices in coding, and does so in a way that I believe will put off beginners, and perhaps @gjergjk71 can contribute there. Sorry to pollute this thread!

joshmhanson commented 5 years ago

Per this comment, I'm re-thinking my approach to Intro CS.

Up until now, my thinking had evolved to the point of wanting two Intro CS courses, of which only one would be required based on background: one for absolute beginners (never coded before) and one for experienced beginners. So I was planning on introducing a new criterion — "beginner suitability" — to the evaluation criteria I mentioned above.

However, reality is just more complicated than dividing the world of OSSU students into these two camps. Everyone is coming to OSSU with a different background. And many students need an emotional hook to get them interested in the field. So that means there would need to be another evaluation criterion, like "emotional impact". CS50 probably scores highest on this due to the extreme physicality of the lectures...

So what I'm mulling over right now as a kind of compromise is to structure the curriculum such that Core CS doesn't make any assumptions about what the student may or may not have learned in Intro CS. All the "proper theory" stuff begins in Core CS, and Intro CS is more to get people motivated.

Intro CS, in this scenario, would contain a reasonable set of recommendations based on the community's experience. It wouldn't be one course, but shouldn't be 10, either. I'm envisioning about three course/material recommendations, each with paragraph(s) contributed from community members explaining why a particular course worked well for them. The student picks the intro course which appears most appropriate to their tastes, background, and interests.

This adds a little bit of complexity but I haven't thought of any major downsides that can't be worked around.

bradleygrant commented 5 years ago

I really like this approach. Having multiple entry points, or multiple "good first course"s, can potentially accommodate people attracted to OSSU for different reasons. You can also entirely sidestep the question of "what's the best first language" by (potentially) supporting introductory courses in (for instance) Python, C, and JS. The candidate can then self-select into the class they think they'll find most relevant in their near future, guided by the testimonials of those who have gone before.

I like this idea. I think it's worth iterating on.

bradleygrant commented 5 years ago

And then of course, once you've convinced them that

  • they CAN code,
  • programming helps them solve problems they have,
  • maybe it's fun,

the hope is from that point, they'll have cultivated a taste for how to code correctly/systematically/intentionally, and see for themselves the value in working through the Core CS set.

anastasiosPou commented 5 years ago

Hello, I have been following your discussions and I'm wondering when the new curriculum is going to be ready. I want to be a computer scientist, but I didn't manage to go to a university, and when I saw OSSU it looked very promising. I have a background in web development and I know JavaScript and Swift, but I still lack knowledge of the fundamentals like algorithms, data structures, computer architecture, etc.

On Mon, Dec 10, 2018, 7:12 AM Bradley Grant <notifications@github.com> wrote:

And then of course, once you've convinced them that

  • they CAN code,
  • programming helps them solve problems they have,
  • maybe it's fun,

the hope is from that point, they'll have cultivated a taste for how to code correctly/systematically/intentionally, and see for themselves the value in working through the Core CS set.


joshmhanson commented 5 years ago

Hi @TasosPoursaitides, I am hoping to have the new version finished by the end of 2018, though it's a pretty ambitious goal given how little time is left. However, it's not completely new, but rather it's building on what we have now. The algorithms courses aren't going anywhere and neither are the courses they depend on (Calculus 1 + Math for CS), so I encourage you to dive right in. No need to wait on a new curriculum version.

waciumawanjohi commented 4 years ago

Lots of great thought here. It has led to an RFC on adding Intro to Programming options to the beginning of the curriculum. Comments welcome! https://github.com/ossu/computer-science/issues/589