lf1-io / padl

Functional deep learning
Apache License 2.0
106 stars 4 forks source link

403 variables #409

Closed wuhu closed 2 years ago

wuhu commented 2 years ago

Fixes #403

From new docs:

Using padl.param one can specify parameters which can then be overridden when loading files.

from padl import transform, param, load

x = param(1, name='x', description='add this much')

@transform
def f(y):
  return x + f

save(f, 'f.padl')

When loading a Transform that uses a parameter, one can specify the value of the parameter as a keyword argument to padl.load:

>>> from padl import load
>>> f = load('f.padl', x=1000)
>>> f(1)
1001

Params usually have a default value. One can force the user to provide them with use_default=False.

jasonkhadka commented 2 years ago

MIssing tests.

jasonkhadka commented 2 years ago

I find this behaviour confusing.

factor = param(2, name='x', description='multiply this much')

@padl.transform
def plus(*a):
    return sum(list(a))

@padl.transform
class multiply:
    def __init__(self, factor):
        self.x = factor

    def __call__(self, arg):
        return arg*self.x

@padl.transform
def plus_factor(x):
    return x+factor
m1 = multiply(factor)
m2 = multiply(factor)
t = plus_factor >> m1 + m2 >> plus

save(t, 't.padl', force_overwrite=True)

t_load = load('t.padl', factor=20)
t_load(1)
12

t_load

Compose:

      │
      ▼ x
   0: plus_factor       
      └────────────────────┐
      │                    │
      ▼ arg                ▼ arg
   1: multiply(factor=2) + multiply(factor=2)
      │
      ▼ a
   2: plus  

t_load2 = load('t.padl', x=20)
t_load2(1) 
840

t_load2
Compose:

      │
      ▼ x
   0: plus_factor        
      └─────────────────────┐
      │                     │
      ▼ arg                 ▼ arg
   1: multiply(factor=20) + multiply(factor=20)
      │
      ▼ a
   2: plus      

In the printout, it shows as factor (the variable name) but while loading you have to load it with x (param name)

wuhu commented 2 years ago

@jasonkhadka thanks for the review.

On your last comment:

The printout shows the keyword argument name of multiply's __init__, not the variable name. We cannot make the param use that because we want to be able to use a param in more than one place:

x = param(123, 'x')

@transform
class A:
    def __init__(self, kwarg_of_A):
        [...]

@transform
class add:
    def __init__(self, kwarg_of_B):
        [...]

t = (
    A(x) >> B(x)
)

save(b, ...)

Another option would be to use the variable name instead of having to give a name (x = param(123) would have x), but that involves some trickery with the potential of new bugs. Also it would mean that one could only use param in the context of a variable assignment.

Do you have another suggestion what we could do about this?

wuhu commented 2 years ago

Added tests.

codecov-commenter commented 2 years ago

Codecov Report

Merging #409 (d06c522) into main (eea9b28) will increase coverage by 0.07%. The diff coverage is 96.96%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #409      +/-   ##
==========================================
+ Coverage   87.35%   87.43%   +0.07%     
==========================================
  Files          15       15              
  Lines        3014     3041      +27     
==========================================
+ Hits         2633     2659      +26     
- Misses        381      382       +1     
Impacted Files Coverage Δ
padl/dumptools/symfinder.py 84.69% <ø> (ø)
padl/dumptools/var2mod.py 92.76% <50.00%> (-0.23%) :arrow_down:
padl/__init__.py 100.00% <100.00%> (ø)
padl/dumptools/serialize.py 91.20% <100.00%> (+2.16%) :arrow_up:
padl/dumptools/sourceget.py 86.88% <100.00%> (ø)
padl/transforms.py 90.95% <100.00%> (+0.04%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update eea9b28...d06c522. Read the comment docs.

jasonkhadka commented 2 years ago

Thinking about it, I think this should fail:

t_load = load('t.padl', factor=20)

as param is defined by

factor = param(2, name='x', description='multiply this much')

Currently, the behaviour is hiding the problem with load(...,factor=20). I am NOT setting the correct value to the intended parameter, and i am also NOT becoming aware of that.

If we get error like:

t_load = load('t.padl', factor=20)
ERROR: Parameter 'factor' does not exist. t.padl contains parameters: (`x`, default_value=2)

I think this should be enough to aware of people about the difference between the variable name and parameter name.

@jasonkhadka thanks for the review.

On your last comment:

The printout shows the keyword argument name of multiply's __init__, not the variable name. We cannot make the param use that because we want to be able to use a param in more than one place:

x = param(123, 'x')

@transform
class A:
    def __init__(self, kwarg_of_A):
        [...]

@transform
class add:
    def __init__(self, kwarg_of_B):
        [...]

t = (
    A(x) >> B(x)
)

save(b, ...)

Another option would be to use the variable name instead of having to give a name (x = param(123) would have x), but that involves some trickery with the potential of new bugs. Also it would mean that one could only use param in the context of a variable assignment.

Do you have another suggestion what we could do about this?

wuhu commented 2 years ago

If we get error like:

t_load = load('t.padl', factor=20)
ERROR: Parameter 'factor' does not exist. t.padl contains parameters: (`x`, default_value=2)

I think this should be enough to aware of people about the difference between the variable name and parameter name.

good idea, I'll add that.