Closed huguesdevimeux closed 3 years ago
I agree it's not the most elegant piece of the code. I'd love to hear some suggestions here. I have some experience with the configparser module from the standard library.
A related issue is the constants.py
file. Both this file and the CONFIG dict are basically ways of tracking hard-coded options. One nice-to-have would be to expose this options and allow the user to change or visualize them more easily. This would be achieved by the config files used by configparser.
The CONFIG and constants.py issues can be handled separately, although we should do both. One of the confusing things about constants.py is that some of the variables defined there aren't actually constants (e.g. the media directories).
As for the CONFIG variables in each class I think it'd be cleaner to just add them as keyword arguments. We could leave the final **kwargs argument to accept additional config variables for backwards compatibility. For example Circle.__init__ here would be something like
def __init__(self, color=RED, close_new_points=True, anchors_span_full_range=False, **kwargs):
self.color = color
self.close_new_points = close_new_points
self.anchors_span_full_range = anchors_span_full_range
self.__dict__.update(kwargs)
Is it always the case that doing self.CONFIG['key']
would be equivalent to doing self.key
instead? for example, are the keys of the CONFIG dictionary iterated over?
Another thing to think about is: which of the things currently stored in CONFIG should be turned into properties.
Is it always the case that doing self.CONFIG['key'] would be equivalent to doing self.key instead? for example, are the keys of the CONFIG dictionary iterated over?
Yes, but the code is pretty complicated. https://github.com/ManimCommunity/manim/blob/master/manimlib/utils/config_ops.py#L23
If that's the case, then I'd vote against converting each key of CONFIG into a member of the class. Because once they're just another member, there is no way of telling apart whether a member is supposed to be one that would go in CONFIG. Said another way, you can't iterate over only the members that are supposed to be CONFIGs without iterating over a bunch of members that are not.
If that's the case, then I'd vote against converting each key of CONFIG into a member of the class. Because once they're just another member, there is no way of telling apart whether a member is supposed to be one that would go in CONFIG. Said another way, you can't iterate over only the members that are supposed to be CONFIGs without iterating over a bunch of members that are not.
use _slots_\ for this
Oooooooooooooh good one
As someone who's most familiar with C#, the CONFIG is a big mess to me. There's no indication of which parameters are available and which ones aren't. kwargs
has the same issue. This may just be a gripe with Python in general, but I think Manim is complicated enough without introducing unknown parameters.
I prefer @eulertour's proposal
As for the CONFIG variables in each class I think it'd be cleaner to just add them as keyword arguments. We could leave the final **kwargs argument to accept additional config variables for backwards compatibility.
If that's the case, then I'd vote against converting each key of CONFIG into a member of the class. Because once they're just another member, there is no way of telling apart whether a member is supposed to be one that would go in CONFIG. Said another way, you can't iterate over only the members that are supposed to be CONFIGs without iterating over a bunch of members that are not.
@leotrs There is no difference between CONFIG variables and instance variables. They're one and the same.
@leotrs There is no difference between CONFIG variables and instance variables. They're one and the same.
Well shit. Let's use kwargs and be done with it.
There is a difference that is purely for readability purposes. As @yoshiask said, it put unknows parameters, which is not good in a quite complex library as Manim. I’m completely in favor of @eulertour ‘point.
In my opinion, the usage of __slots__ makes it pretty clear to anyone who reads the code which attributes the class may have. So it will turn the code even more readable!
(Also note that __slots__ has memory advantages when many instances of a class are created, but this shouldn't be the main argument since it's pretty much irrelevant most of the time... but any improvement in this sense is welcome, of course)
I'm leaning toward __slots__
now.
An update on that.
As discussed on Discord, we should use a dataclass model for our classes, using packages such as dataclasses
(from stdlib) and attrs
.
Sample code:
from dataclasses import dataclass
@dataclass
class MyCoolClass:
a: int = 5
b: int = 6
# then
obj = MyCoolClass(a=5, b=6)
# or
obj = MyCoolClass(5, 6)
Note that it automatically generates an __init__
method with each attribute as a parameter (which can be used as a kwarg or not).
Also note that, with the attrs
package, the code would be the exact same. The only thing that would change would be the name of the decorator:
import attr
@attr.s
class MyCoolClass:
a: int = 5
b: int = 6
# everything else is the same.
Pros of using attrs
over dataclasses
:
dataclasses
by using the __postinit__
method;dataclasses
, one would need to convert using __postinit__
, which means there would be a timespan during which the attribute would have the wrong type, and only on the __postinit__
call would this be corrected. (Of course, it's not like said timespan is too big though, so we can mostly ignore this)__slots__
: Really, just write @attr.s(slots=True)
and it creates the __slots__
, which gives a memory performance boost when many instances of a class are created. (Otherwise, though, it's a mostly invisible difference)Cons of using attrs
:
What do y'all think?
^Also, by the way: Worth noting that, if we actually implement this, this could be part of a major manim rework, i.e., working on this would imply reorganizing the code and whatnot. So we must keep this in mind.
This would likely be positive, of course. But we will need cooperation.
I like the idea of using dataclasses/attrs. I am usually partial to staying within stdlib, but since dataclasses were modeled directly after attrs, I think I'm OK with using attrs.
@yoshiask what do you think about this?
Regarding this being part of a major rework. I propose this: whoever ends up working on this would need to touch a number of files (and should probably wait until we have some minimal tests in place #22), but I suggest that this PR does not include any refactoring beyond changing CONFIG for attrs/dataclasses. In writing this PR, the author will have a better view of what refactoring needs to be done in the future, but we can leave that for later, I think.
Dataclasses and attrs looks pretty good to me! Unfortunately I can't do it now, but I can start switching everything over to our new structure towards the end of the month, and hopefully by then we'll have CI and stuff set up.
I think we have consensus among everyone who's participated here, except for @eulertour.
Dataclasses and attrs looks pretty good to me! Unfortunately I can't do it now, but I can start switching everything over to our new structure towards the end of the month, and hopefully by then we'll have CI and stuff set up.
I'd be more than willing to help here as well! I'll assign our names in the issue for now.
Do dataclasses/attrs allow for positional arguments to __init__
and for optional **kwargs
for example what's used here? I'm reluctant to make such a large change that breaks old code.
Yes, if you declare the variable with a type but don't assign it a value.
from dataclasses import dataclass
@dataclass
class A:
a:int
b:str
c:list = []
This is the same as:
class A:
def __init__(self,a:int,b:str,c:list=[])
Only when declared with a type? Would this make some of our code have type hints while the rest doesn't?
In python you can only create unassigned variable by providing a type I think. Also, we can always keep the type to be Any
if we don't know it.
Right. I think I would have to see what manim looks like using dataclasses/attrs before actually committing to its usage. So maybe like, someone can volunteer to take one of the smaller/simpler files and rewrite it using dataclasses. That way we can have concrete pros and cons. Of course this is just a dream, I'm not suggesting that someone spends time on a PR that may or may not be merged.
Alternatively, we just need to think real hard about how other parts of the code will be affected by using dataclasses/attrs. (And I wouldn't think of doing this until we have at least a few tests in place...)
@kilacoda I see that that covers positional arguments but does it cover kwargs as well?
I like the idea of using dataclasses/attrs. I am usually partial to staying within stdlib, but since dataclasses were modeled directly after attrs, I think I'm OK with using attrs.
I'd like to follow this course of action: we develop everything using only the stdlib features (dataclasses
). Then, if we do reach a point where we do need attrs
, we can switch to it. Also, we may compare memory usage with and without attrs
+ slots=True
, in order to determine if it is worth it.
Do dataclasses/attrs allow for positional arguments to
__init__
and for optional**kwargs
for example what's used here? I'm reluctant to make such a large change that breaks old code.
For the optional kwargs, one can use inheritance for this: the class should inherit the attributes of its parents. This way we have more type-strict (i.e. easier to typecheck, not actually "strict" because this is python) classes and more predictable models. If it depends on unpredictable parameters, imo it's bad design
we develop everything using only the stdlib features (dataclasses). Then, if we do reach a point where we do need attrs, we can switch to it.
I'm afraid that if we design/develop everything around dataclasses and get used to those, we might never really feel the need to move on to attrs because, well, we designed it without that need. We may never know what we're missing. It's the problem of unknown unknowns you see.
Okay well I guess we can do the reverse then? No idea. I mean, it's not like attrs
has a ton of advantages, but... Well, let's use it then. If we ever feel like changing our mind... we can discuss that later
In python you can only create unassigned variable by providing a type I think. Also, we can always keep the type to be
Any
if we don't know it.
I'm actually quite familiar with the data types of many of the variables in Manim (because I had to find a way to project them to C#)
At this point my personal opinion is that @PgBiel and @yoshiask should determine which library to use (between attrs and dataclasses). Why they? Because they assigned this issue to themselves. And I don't see anyone else strongly objecting to either option.
As long as @eulertour's questions have been answered, that is.
My only request is that whatever we settle on, it should still be a drop-in replacement. I wouldn't want to get rid of the CONFIG vars only to introduce another quirk that people still don't understand.
At this point my personal opinion is that @PgBiel and @yoshiask should determine which library to use (between attrs and dataclasses). Why they? Because they assigned this issue to themselves. And I don't see anyone else strongly objecting to either option.
I say we go with attrs for now; if needed, we can change our mind later, without any costs to it.
My only request is that whatever we settle on, it should still be a drop-in replacement. I wouldn't want to get rid of the CONFIG vars only to introduce another quirk that people still don't understand.
They (dataclasses) work directly with python attributes. That is, they work very similarly to CONFIG, but using attributes directly (without "digest_config") and allowing for more customization and better type interpretation by IDEs.
@PgBiel looking forward to reviewing that PR! Something tells me to expect a bit more discussion there, but at least we have settled on which library to use.
Hi, I actually posted my idea about CONFIG in discord sometime ago, the message is so long that you can go to here to see it if you like... https://discord.com/channels/581738731934056449/581738732646957057/701035356950364221
The main idea is to use metaclass
, but after seeing the solution of using dataclasses/attrs
, I think this is probably a good way(at least better than mine). And my most concern is whether the attributes defined by dataclasses/attrs
can be recognized by IDE properly and give the correct auto-completion, because as a non-native English speaker, I can't remember those attributes' name correctly...
Also, I have changed (almost) all the CONIG of mobject into attributes in here sometime ago. https://github.com/xy-23/manim/tree/dev
The main idea is to use
metaclass
, but after seeing the solution of usingdataclasses/attrs
, I think this is probably a good way(at least better than mine).
Your metaclass is basically the attr.s
and dataclass
decorators, just with less features... so you pretty much converge with our plan
And my most concern is whether the attributes defined by
dataclasses/attrs
can be recognized by IDE properly and give the correct auto-completion, because as a non-native English speaker, I can't remember those attributes' name correctly...
Yes
Also, I have changed (almost) all the CONIG of mobject into attributes in here sometime ago. https://github.com/xy-23/manim/tree/dev
I'll take a look!
Note: work on this should begin after we are done with all other "Initial Cleanup" issues
Edit: although, work on documentation would certainly be useful here, so might want to postpone until then? Idk (leave your opinions below)
(cc @leotrs @eulertour @XorUnison)
Can you be more specific? Do you mean you wish the CONFIG stuff was documented before changing it? If so, I suggest not waiting for documentation. Instead, I suggest writing tests for the CONFIG stuff first, and make sure that the dataclass/attr rewrite passes the tests.
I mean it gets much cleaner if we know the types beforehand... I think all the three projects - dataclasses, typings and documentation - are all kind of connected, and working on one implies on working on all of them
Agreed. And that's exactly why we can't wait until one is done to start the other two. I suggest you delineate what kind of documentation you need for this project and work on it at the same time, on the same PR. You can confer with the people in charge of the docs branch so there are no conflicts.
This is now the only standing issue in the "initial cleanup milestone". @PgBiel, I don't mean to rush you, but do you have any updates? :)
I'll try to start work on it tomorrow, but I gotta note that I'm gonna be quite busy (especially on Friday) but, again, I'll see what I can do
Once this is finished and merged, I'll start working on #31
Now that #98 has been merged, it has deprecated/eliminated a bunch of the keys in the CONFIG dictionaries in classes across the codebase. It'd be great to have an update here on what still needs to be done.
Is dataclassess introduced in python 3.7 right? Then, doing this would mean that manim won't with python 3.6. If I am right, still Ubuntu Linux has Python 3.6 as default and the latest version of python. Isn't that a problem?
You raise a good point. @PgBiel thoughts?
We're using attrs
though and not dataclasses
.
correct, kilacoda. We're not going to use dataclasses
, but attrs
, which supports most python versions, including python 2
We have discussed about the variable
CONFIG
previously on discord. What should we do about it ?I personally think we should modify at least a little bit of its architecture. it alters the readability a lot. One problem with that would be that in changing
CONFIG
we would have to review the entire architecture of the code.What do you think?