Open veni-vidi-vici-dormivi opened 2 days ago
Thanks for getting this discussion started. For me it would be important to understand the requirements better & I try to do this by going over the code / writing tests. Some random thoughts and questions:
eval
and exec
calls which would make me happy :smile:loc=c1, scale=c1
)?Should coeffs be an array? That would align with the structure of the minimize
function.
def loc_func(c, x):
return c[0] + c[1] * x
def shape_func(c):
return c[2]
A problem with going away from the expression as a string could be that it becomes harder to save the expression. I think it is important to save the expression in some way, only saving the parameters won't get you anywhere if you lost the expression, but we could handle that by still setting a name and asking the user to put all relevant information there. Then we could save the name as an attribute with the parameters.
Should coeffs be an array?
I like that idea, then we could get rid of the signature
call and saving could also be more flexible.
Can it be that we want to have the same coeff for two params
That is possible for the structure I gave above, I'm just not sure how the optimization would handle that...
I see the appeal of this solution, but I see there a major problem. In summary, it clashes with the objective of Expression
, and future applications will be hindered by the need to provide all functions along.
The whole objective of the class Expression was to finally be capable of doing all distributions and all expressions on parameters. To have full flexibility.
With this alternative, the user will have to define its functions and pass them as arguments. We have to make sure that the same functions are used for training AND emulation. Thus pass the functions to the users. Thus have a script with all potential functions. Which is precisely what i wanted to avoid, being constrained to predefined functions. In my experience, there will always be new ideas, new expressions. So, every single time, we would have to add new functions to the script with all predefined functions. I sincerely fear that it will be messy to have them all.
So, we have to choose, do we pass directly the functions and lose the full flexibility, or do we stick to the strings, with the fix on the parsing? My personal opinion is for the full flexibility, because of how much time i've spent in this code, because of future developments & applications. But again, I see the appeal of your solution.
One extra point: the object returned by the alternative class is the .rvs
object. It should be a scipy,.stats distribution, so that we can call anything. I've already to call logpdf
, logpmf
, cdf
, sf
, isf
& support
, and for plots, even more.
On the other points raised by @mathause:
exec
& eval
: yes, required with the original Expression
, but unless i'm mistaken, they are used correctly here. So even if these functions are dangerous, it should be fine. Hopefully?Of course, it requires some understanding of the string as input, with the parsing failing with extra commas (but potentially easy fix). It also requires
I was thinking why we do not actually pass functions as arguments instead of a strings to the Expression class. I was thinking something like this:
And then feeding the functions to the optimization routines like we would after parsing? This is not incredibly thought through yet but it might be a way around the parsing.