Closed temyurchenko closed 1 year ago
The argument to X86ProgramDefs
should have type list[FunctionDef]
, where FunctionDef
comes from Python's ast
module.
The argument to IndirectJump
should have type location
(could be Variable
or Reg
)
The two arguments to TailJump
should have types location
and int
respectively.
The argument to Global
should be a str
The argument to
Global
should be astr
If Global's argument is a string, why does it explicitly do str(self.name)
?
I don't know why Global
's str
method was written that way, but the constructor for Global
is always passed a string. I'm also certain str('hello')
is just 'hello'
I don't know why
Global
'sstr
method was written that way, but the constructor forGlobal
is always passed a string. I'm also certainstr('hello')
is just'hello'
True. Just was wondering if it was ever intended be used on more than just strings
I don't know why
Global
'sstr
method was written that way, but the constructor forGlobal
is always passed a string. I'm also certainstr('hello')
is just'hello'
Anyway, everything is ready then. I also have another PR that is fixing a bug a bit complicating the homework.
Did you possibly get enough time, @Fyrbll?
Thanks for the reminder. Revisiting this now.
IndirectCallq
is annotated with type str
. Pycharm correctly suggested that it should be an arg
. Essentially it's going to be a variable initially that's the target of an leaq
instruction. This variable might get a register or a stack location as a home eventually. (In patch-instructions
we will make sure the immediate target of this leaq
is a register after which we will move the value to its true home).The two arguments to TailJump should have types location and int respectively.
Although it is true that in this course we always make the argument of TailJmp
the register %rax
, people might use a function with return type arg
to process its argument. To keep things flexible I guess we should change the first argument's type to arg
. Sorry about that.
EDIT. Formatting of quote.
The first argument to
IndirectCallq
is annotated with typestr
. Pycharm correctly suggested that it should be anarg
. Essentially it's going to be a variable initially that's the target of anleaq
instruction. This variable might get a register or a stack location as a home eventually. (Inpatch-instructions
we will make sure the immediate target of thisleaq
is a register after which we will move the value to its true home).
- I was too hasty when I said earlier that
The two arguments to TailJump should have types location and int respectively.
Although it is true that in this course we always make the argument of
TailJmp
the register%rax
, people might use a function with return typearg
to process its argument. To keep things flexible I guess we should change the first argument's type toarg
. Sorry about that.EDIT. Formatting of quote.
Done
Thanks for seeing those updates through.
I want to suggest these additional changes that ensure x86_ast.py
passes all our test cases.
X86Program
Don't set frozen=True
for X86Program
. This is because we might want to "attach" extra information to a program between passes. Maybe we compute some property my_prop
of a program p
in some pass which will be used in a later pass, we'd like to write p.my_prop = ...
X86ProgramDefs
Don't set frozen=True
for X86ProgramDefs
for the same reason.
Instr
Change the definition of the class Instr
so that Python doesn't complain that field args
, which is a list, is unhashable.
@dataclass(frozen=True)
class Instr(instr):
...
def __init__(self):
return id(self)
Anything that is a Subclass of Instr
To ensure we don't break anything we want to somehow retain Instr
's __hash__
unless otherwise specified in the class definition in the previous version of ast.py
.
For example to pass the existing tests we can subsequently change Callq
to call its parent's __hash__
@dataclass(frozen=True)
class Callq(instr):
...
def __hash__(self):
return super.__hash__(self)
I also mentioned "unless otherwise specified ... previous version of ast.py
". One example of this is Variable
@dataclass(frozen=True)
class Variable(location):
id: str
def __str__(self):
return self.id
def __hash__(self):
return hash(self.id)
Just for reference here's a minimal sequence of changes that will allow the reference compiler to pass all tests.
-@dataclass(frozen=True)
+@dataclass
class X86Program:
...
@dataclass(frozen=True)
class Instr(instr):
instr: str
args: List[arg]
def source(self):
return self.args[0]
def target(self):
return self.args[-1]
def __str__(self):
return indent_stmt() + self.instr + ' ' + ', '.join(str(a) for a in self.args) + '\n'
+ def __hash__(self):
+ return id(self)
@dataclass(frozen=True)
class Callq(instr):
func: str
num_args: int
def __str__(self):
return indent_stmt() + 'callq' + ' ' + self.func + '\n'
+ def __hash__(self):
+ return super.__hash__(self)
To provide a better answer I will need to examine why the "default" dataclass-derived hash function for Callq
causes wacky results.
X86Program
Don't set
frozen=True
forX86Program
. This is because we might want to "attach" extra information to a program between passes. Maybe we compute some propertymy_prop
of a programp
in some pass which will be used in a later pass, we'd like to writep.my_prop = ...
I'm going to do that, but just to see if feasible, it seems that the more common approach here is storing these additional properties in a hash-table, rather than monkey-patching? Would that work, or would it require more effort to further adapt?
Instr
Change the definition of the class
Instr
so that Python doesn't complain that fieldargs
, which is a list, is unhashable.
That is not the best definition of hash
, because it conflicts with __eq__
. We'll need to redefine __eq__
as well, but ideally, change args
to be a tuple. I assume the latter approach would cause issues with the existing code?
For example to pass the existing tests we can subsequently change
Callq
to call its parent's__hash__
It's indeed interesting why it's required. Could it be because of the hash
/__eq__
divergence describe above?
I also mentioned "unless otherwise specified ... previous version of
ast.py
". One example of this isVariable
I'm not sure I completely get it. Do tests rely on a particular implementation of hash
, so new classes should all have the same __hash__
as the previous ones?
@Fyrbll could you please check if the latest commit fixes the issues you mentioned?
ping
This commit fixes the issues I mentioned and passes all tests. It should be fine to merge.
X86Program
and X86ProgramDefs
do not have frozen
set to True
anymore.
Artem changed the constructor of Instr
to accept any Iterable
, such as a list. As mentioned above, Python's lists are not hashable. The iterable passed to the constructor is stored as a tuple. Tuples are hashable. This removes the need to make __hash__
the same as id
for Instr
. However just because Instr
is hashable doesn't mean it's using a custom hash function. See the next point.
In the latest commit, subclasses of instr
(including Instr
) have frozen
set to True
and eq
set to False
. Python documentation on dataclasses says
If eq and frozen are both true, by default dataclass() will generate a hash() method for you. If eq is true and frozen is false, hash() will be set to None, marking it unhashable (which it is, since it is mutable). If eq is false, hash() will be left untouched meaning the hash() method of the superclass will be used (if the superclass is object, this means it will fall back to id-based hashing).
Since instr
uses id-based hashing, all its subclasses should use id-based hashing as well. This should include Instr
. In fact, changing the definition of Instr
to the one in Artem's previous commit also passes all tests, so the whole tuple-as-argument situation isn't really necessary in my opinion, but it's also a neat trick so I don't want to get rid of the code.
@murcake do I have the right idea about the Instr
class?
X86Program
andX86ProgramDefs
do not havefrozen
set toTrue
anymore.
I don't frankly remember why it was done. My guess is that the classes uses mutable dicts and lists, so I thought that it was designed to be mutable, so it shouldn't be frozen. If you think this guess is valid, let's leave it that way, if not—I'll reintroduce frozenness.
changing the definition of
Instr
to the one in Artem's previous commit also passes all tests, so the whole tuple-as-argument situation isn't really necessary in my opinion
The main reasoning I believe, as above, was that Instr
should be immutable. Thus, its args
should be represented as a tuple. Thus, we need that weird bit in the constructor to accommodate list arguments as per book.
What to actually do with it depends on your input on if my view on mutability is correct here.
@murcake do I have the right idea about the
Instr
class?
Yes!
I don't frankly remember why it was done. My guess is that the classes uses mutable dicts and lists, so I thought that it was designed to be mutable, so it shouldn't be frozen. If you think this guess is valid, let's leave it that way, if not—I'll reintroduce frozenness.
Yes, I meant that they should not be frozen and I wanted to record that you made this change. Apologies for the confusion.
The main reasoning I believe, as above, was that Instr should be immutable. Thus, its args should be represented as a tuple. Thus, we need that weird bit in the constructor to accommodate list arguments as per book. What to actually do with it depends on your input on if my view on mutability is correct here.
I get it now. Thanks for the comment about immutability. I was only thinking in terms of __hash__
I haven't turned some classes to dataclasses, mainly because I don't know what should be the types of their arguments. It would be very helpful if someone pointed them out.
The classes in question are:
X86ProgramDefs
,IndirectJump
,TailJump
,Global