powerlang / egg

Egg Smalltalk
MIT License
13 stars 2 forks source link

Proposal: Rename OrderedCollection to List, Dictionary to Map #10

Open melkyades opened 4 years ago

melkyades commented 4 years ago

Module system should help with compatibility (letting refs to Dictionary and OrderedCollection continue to work)

gbracha commented 3 years ago

Well, FWIW, those are the names we use in Newspeak. And for us, it's easy to rename things for local use, as every module has its own namespace. But YMMV.

janvrany commented 3 years ago

@gbracha yes, in some of our internal talks I expressed a wish to have something not unlike NewSpeak modules and imports, unifying the concept of module (package) / class / nested class / shared pool. I hope we'll have time to get to this "soonish"...

gbracha commented 3 years ago

Thanks. Let me know if you ever wish to discuss.

melkyades commented 3 years ago

Well, FWIW, those are the names we use in Newspeak. And for us, it's easy to rename things for local use, as every module has its own namespace. But YMMV.

I didn't know that, so I'm glad I'm not alone.

Thanks. Let me know if you ever wish to discuss.

For sure, it would be great to talk about this. My caveat about Jan's suggestion is that classes already have too many responsibilities: creating instances and defining their behavior; so adding more to them may cause more problems than what it would solve. What I'd like is to let them define namespaces, but indirectly (through some ivar?). I'd also like to think about dividing them so that shape gets independent from behavior.

melkyades commented 3 years ago

The other caveat is that the DMR introduces some restrictions: when things get more late bound, they may cause an infinite recursion on the DMR that wouldn't occur on a classic VM

gbracha commented 3 years ago

So classes in Newspeak still have the same responsibilities: creating instances and defining behavior (though you could argue that it is mixins that define behavior). It's just that instances now cover everything - scoping is now covered by method lookup, so things are simpler. And because everything now goes via a message passing interface, behavior is independent of representation (shape). But these are generalities, and in any case I'm not here to convince you to change Smalltalk.

The real tradeoffs are different. You pay for the encapsulation Newspeak gives you. You can't just grab everything from a global namespace. And once you add access control, you also find that you can't always get at what you want that easily. Whitebox testing is a good example. Extension methods are another (assuming you think they are a good thing).

In short, the security/modularity discipline does sacrifice some convenience. The benefits come at some cost. Just as mirrors are a bit less convenient than accessing classes and method tables directly, but enable many things that are just don't work without them.

You can mitigate these costs during development, but that takes investment. For example, we had to change the class browser UI model, because of nesting.

I would like to hear more about your point about late binding being a problem on the DMR. In general, I'm wondering how much work it would be to define a Newspeak DMR using powerlang.

melkyades commented 3 years ago

creating instances and defining behavior (though you could argue that it is mixins that define behavior)

actually in Bee the behavior is defined indirectly too (each class has its own Behavior, which points to the method dictionary and the next Behavior -of the superclass-). One of the questions I often ask myself is, could we have shape separated from the class hierarchy? would it be nice or ugly? (maybe in newspeak it's like that?) specially considering a desire of having everything late bound. But yes, I guess this would be a quite different system from Smalltalk. The "incremental improvement" I've talked about with Jan is to at least have all dictionary-based vars late bound: globals, pools and class vars. Basically, if some identifier is not an instance variable, then it is late bound. Maybe in the end it leads to "ok, why where ivars different? let's make them late bound too".

Extension methods are another (assuming you think they are a good thing). This is another whole topic. To me, it's not whether they are good or bad, but whether there's a nice way to live without them or have some thing as powerful with so little effort.

I would like to hear more about your point about late binding being a problem on the DMR. In general, I'm wondering how much work it would be to define a Newspeak DMR using powerlang.

Having a system defined in itself leads to some chicken-and-egg problems that have to be shortcircuited. For example, in the DMR, the lookup methods are precomputed, and nativized with a special linker so that message sends in this closure can be emitted as invokes (as lookup methods form a small closure, you can know who will be the receiver of each send and know the method to be executed). Now, if you make name binding lazy, it is very likely that your name lookup will also require this same form of shortcircuiting. It seems totally doable, but it's also likely that it might require some more work than it looks in the first place (I'm worried/interested in the interaction bewteen shortcircuiting method lookup and shortcircuiting name lookup). And then regarding the Newspeak & DMR, I see quite some options, all of which include AOTing a small kernel and a JIT. Easier ones could be Bee kernel & bee jit, or bee kernel & cog jit (and on top of any of those load Newspeak); or in the "turtles-all-the-way-down", Newspeak kernel & Newspeak JIT.

shingarov commented 3 years ago

in the DMR, the lookup methods are precomputed, and nativized with a special linker so that message sends in this closure can be emitted as invokes (as lookup methods form a small closure, you can know who will be the receiver of each send and know the method to be executed). Now, if you make name binding lazy, it is very likely that your name lookup will also require this same form of shortcircuiting.

To my ears, this sounds like screaming "JSR292" with its set of interesting situations (what if an exception occurs during custom lookup). Shijie Xu at UNB recently published a dissertation on optimization of Method Handles in a J9 implementation. It would be interesting to compare his approach to DMR.

gbracha commented 3 years ago

So name lookup and method lookup in Newspeak are the same thing in theory. In practice, access to temporaries is not really implemented as a lookup. If one writes the implementation in Bee, I don't think much changes then. If one goes for the turtles-all-the-way-down solution (which is my goal in the end), it seems to me that the difference in Newspeak would be that any instance variable access in the lookup methods would have to be treated as you would treat a method lookup. This might also include access to enclosing objects. I would perhaps do this in phases, first in Bee and the in Newspeak; but I'd have to understand a lot more about the process before deciding.

shingarov commented 8 months ago

As this thread has now been dug out of inactivity, gave some more thinking to the question and now I am curious. What do we mean by renaming OrderedCollection to List? To me, both words seem to have settled meanings and those meanings are quite different, no?

OrderedCollection is a memory-mutating structure, typically defined by some implementation; said implementation typically comprises a backing Array and other things (indices). The imperative operations of OrderedCollection mutate the Array, the indices, sometimes allocate new Arrays and/or make existing Arrays garbage.

By 'List` most authors usually mean a functional structure, commonly defined algebraically, as a sum type of Empty List (isomorphic to void) and Non-empty List (a product of some A and List). I am almost tempted, inspired by https://github.com/shingarov/MachineArithmetic/issues/164 , to write something like

SequenceableCollection subclass: #List
  ...

and

List subclass: #EmptyList
  instanceVariableNames: ''
  ...

and

List subclass: #NonEmptyList
  instanceVariableNames: 'head tail'
  ...

It would be interesting to prove that OrderedCollection is in some sense "the same" as List, thus gaining insight into in which sense exactly. They can't literally be isomorphic, because one changes state and the other one is pure; but it would be very interesting to axiomatize OrderedCollection as a state transformer and show some correspondence between its behaviour (in the finite case at least) and that of List (perhaps they are Galois-connected? or adjunct?). I would expect such exercise to probably uncover juicy insights into nuances of what OrderedCollection does. I would further expect it to find fascinating bugs.

It becomes more interesting in the infinite case. After reversing the arrows, List generalizes to subsume both finite lists and infinite streams; moreover, it is able to represent any "steps of a computation" because of the terminal property. What is an infinite OrderedCollection? Does OrderedCollection also enjoy being terminal (in some category more appropriate for imperative computation)? If not, is there a "better" one?

jpimas commented 8 months ago

The meaning of this rename is much more mundane. The name OrderedCollection is clunky, too long, specially for such a frequent kind of object. The idea is to keep the exact same implementation of OrderedCollection, which I guess is the same than Python's List or Java's ArrayList. With Egg namespaces, it should be possible to keep OrderedCollection working, or to redefine List to other things in a particular context.