songjhaha opened 2 years ago
We wrap the Paddle module in a struct and define a `vjp` function with respect to both the params and the inputs, like this:
```julia
struct PaddleModuleWrapper
    NN::PyObject
end

function vjp_wrt_params_and_args(nn::PyObject, pyargs...; kwargs...)
    res = nn(pyargs...; kwargs...)
    pyparams = nn.parameters()
    paramslen = length(pyparams)
    function vjp_func(Δ)
        # gradients w.r.t. the params first, then the inputs
        grad = paddle.fluid.dygraph.grad([res], [pyparams..., pyargs...], Δ, retain_graph=true)
        return (Tuple(grad[1:paramslen]), grad[paramslen+1:end]...)
    end
    return res, vjp_func
end
```
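For illustration, a minimal usage sketch of the vjp (assuming a wrapped Paddle module `nn::PyObject` and a Paddle input tensor `x` already exist; the cotangent is seeded with ones):

```julia
# Hypothetical usage: forward pass plus pullback (names `nn` and `x` are assumed).
res, vjp_func = vjp_wrt_params_and_args(nn, x)
Δ = paddle.ones_like(res)              # cotangent seed, same shape as the output
param_grads, input_grad = vjp_func(Δ)  # (tuple of param grads, grad w.r.t. x)
```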
which would give us the correct result. But things get difficult when we try to update the params with optimizers, and in some cases we need to separate the params from the model, like the examples in NeuralPDE:
```julia
paddlewrap = PaddleFCNet(2, 1, 3, 16; dtype="float64", activation="sigmoid")
initθ, _ = Optimisers.destructure(paddlewrap)
discretization = PhysicsInformedNN(paddlewrap, StochasticTraining(100; bcs_points = 40), init_params = initθ)
```
It should be fine if we use DLPack.jl, through which Python's tensor and Julia's array share the same underlying data, so any in-place change to the array changes the tensor too. But it looks like NeuralPDE does not change the params in the model directly; instead it updates a new flat array generated during training.
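For reference, a minimal sketch of the flatten/rebuild pattern from Optimisers.jl that produces this flat array (`model` and the gradient vector `g` are assumed):

```julia
using Optimisers

flat, rebuild = Optimisers.destructure(model)  # flat vector of all params + rebuild closure
flat = flat .- 0.01 .* g                       # updates act on the flat copy...
model2 = rebuild(flat)                         # ...the original model only changes on rebuild
```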
So the solution for now is to convert the params to Python tensors every time before calling `model(x)`, which may add extra cost. In the future, we could look more into Optimisers.jl and NeuralPDE.jl to create a new function for updating the params of the wrapper.
For a simple fully connected net, we can write the stateless forward pass by hand, threading an iteration state over the flat params:

```julia
function (stateless_module::PaddleStatelessFCNet)(params::Vector, inputs; kwinputs...)
    out = PyNULL()
    copy!(out, inputs)
    state = 1
    # each layer consumes its params and returns the updated iteration state
    for layer in stateless_module.layers
        state = PaddleLayerForward!(out, params, state, layer)
    end
    return out
end

struct PaddleLinear <: PaddleStatelessLayer
    features_ins::Int
    features_outs::Int
end

function PaddleLayerForward!(out::PyObject, params::Vector, state::Int, L::PaddleLinear)
    weight, state = iterate(params, state)
    bias, state = iterate(params, state)
    copy!(out, paddle.matmul(out, weight))
    copy!(out, paddle.add(out, bias))
    return state
end

struct PaddleActivation <: PaddleStatelessLayer
    act::PyObject
end

function PaddleLayerForward!(out::PyObject, params::Vector, state::Int, L::PaddleActivation)
    copy!(out, L.act(out))
    return state
end
```
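As a hypothetical usage example (the `PaddleStatelessFCNet` constructor and the ordering of `params` are assumptions based on the forward pass above):

```julia
# Hypothetical: a 2 -> 16 -> 1 net run through the stateless forward pass.
layers = [PaddleLinear(2, 16),
          PaddleActivation(paddle.nn.functional.sigmoid),
          PaddleLinear(16, 1)]
net = PaddleStatelessFCNet(layers)  # assumes a `layers` field, as used above
params = [w1, b1, w2, b2]           # Paddle tensors in traversal order (assumed to exist)
y = net(params, x)
```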
But if we are going to build a more complex net, we would need to write every forward function by hand. So the solution for now is just to create another type for a general net, and copy the params into the Python model before calling its forward function:
```julia
# a rough solution for General Net
struct PaddleStatelessGeneralNet<:PaddleStatelessModule
NN::PyObject
end
function (stateless_module::PaddleStatelessGeneralNet)(params::Vector, inputs; kwinputs...)
    # copy the Julia-side params into the Python model's parameters in place
    map((p, p_new) -> p.set_value(p_new), stateless_module.NN.parameters(), params)
    out = stateless_module.NN(inputs)
    return out
end
```
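A hypothetical usage sketch: any Paddle model can be wrapped this way, as long as `params` matches `NN.parameters()` in order and shape:

```julia
# Hypothetical: wrap an arbitrary Paddle model and call it statelessly.
py_model = paddle.nn.Sequential(paddle.nn.Linear(2, 16),
                                paddle.nn.Sigmoid(),
                                paddle.nn.Linear(16, 1))
net = PaddleStatelessGeneralNet(py_model)
y = net(params, x)  # `params`: vector of tensors matching py_model.parameters()
```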
But when using `QuadratureTraining()`, the training becomes pretty slow; I'm still checking the reasons. `QuadratureTraining()` uses an adaptive quadrature method which involves some integration algorithm, so maybe it's using a bigger sample set, or maybe there are some problems in the package. I'm not sure.
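For reference, the slow case is just the earlier discretization with the training strategy swapped (a sketch; `QuadratureTraining`'s default arguments are assumed):

```julia
# Same setup as above, but with the adaptive-quadrature strategy:
discretization = PhysicsInformedNN(paddlewrap, QuadratureTraining(), init_params = initθ)
```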