songjhaha opened 2 years ago
It looks like the code spends a lot of time converting inputs and outputs between tensors and Julia arrays with DLPack:
"bs=32" => 2-element BenchmarkTools.BenchmarkGroup:
tags: []
"forward" => 3-element BenchmarkTools.BenchmarkGroup:
tags: []
"funcpaddle" => Trial(186.923 μs)
"jl" => Trial(454.193 μs)
"paddle" => Trial(171.578 μs)
"backward" => 3-element BenchmarkTools.BenchmarkGroup:
tags: []
"funcpaddle" => Trial(587.943 μs)
"jl" => Trial(1.110 ms)
"paddle" => Trial(601.688 μs)
where the "funcpaddle" case uses the PaddleStatelessModule in https://github.com/songjhaha/PaddleChainRules.jl/blob/main/src/Net.jl#L19 for the computation, and both inputs and outputs are kept as Paddle tensors.
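For reference, the conversion cost can also be measured in isolation. Below is a minimal sketch assuming PyCall, DLPack.jl's share/wrap helpers, and Paddle's paddle.utils.dlpack module; it only illustrates the Array-to-tensor round trip that the "jl" case pays on every call, and is not the actual PaddleChainRules.jl code path.

# Isolate the cost of moving data between Julia Arrays and Paddle tensors
# via DLPack (illustrative sketch, not the exact PaddleChainRules.jl internals).
using PyCall, DLPack, BenchmarkTools

const dlpack = pyimport("paddle.utils.dlpack")

pyto_dlpack(t) = dlpack.to_dlpack(t)      # Paddle tensor -> DLPack capsule
pyfrom_dlpack(c) = dlpack.from_dlpack(c)  # DLPack capsule -> Paddle tensor

x = randn(Float32, 64, 32)

# Julia Array -> Paddle tensor (what inputs and parameters go through)
@btime DLPack.share($x, $pyfrom_dlpack)

# Paddle tensor -> Julia Array (what outputs and gradients go through)
t = DLPack.share(x, pyfrom_dlpack)
@btime DLPack.wrap($t, $pyto_dlpack)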
In PyCallChainRules.jl, the benchmark https://github.com/rejuvyesh/PyCallChainRules.jl/blob/main/benchmark/bench_pytorchmlp.jl gives similar results.
In the NeuralPDE benchmark, both paddle and torch are much slower than Flux:
"cpu" => 3-element BenchmarkTools.BenchmarkGroup:
tags: []
"Flux" => Trial(36.633 ms)
"torch" => Trial(224.266 ms)
"paddle" => Trial(215.442 ms)
Here we use the grid training strategy GridTraining(0.1). The QuadratureTraining() strategy also works, but it is quite slow with this package.
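For context, the NeuralPDE comparison is set up roughly as sketched below, using the standard 2D Poisson example from the NeuralPDE.jl docs. The exact imports and optimizer names depend on the package versions (e.g. Optimization.jl/OptimizationOptimisers vs. the older GalacticOptim), and the actual notebook may differ; swapping chain for a PaddleChainRules/PyCallChainRules wrapped network is what gives the paddle/torch rows above.

# Rough sketch of the NeuralPDE benchmark: 2D Poisson on the unit square,
# discretized with GridTraining(0.1), timed over a short fixed-length training run.
using NeuralPDE, Flux, ModelingToolkit, Optimization, OptimizationOptimisers
using BenchmarkTools
import ModelingToolkit: Interval

@parameters x y
@variables u(..)
Dxx = Differential(x)^2
Dyy = Differential(y)^2

eq = Dxx(u(x, y)) + Dyy(u(x, y)) ~ -sin(pi * x) * sin(pi * y)
bcs = [u(0, y) ~ 0.0, u(1, y) ~ 0.0, u(x, 0) ~ 0.0, u(x, 1) ~ 0.0]
domains = [x ∈ Interval(0.0, 1.0), y ∈ Interval(0.0, 1.0)]

chain = Chain(Dense(2, 16, σ), Dense(16, 16, σ), Dense(16, 1))  # Flux baseline

discretization = PhysicsInformedNN(chain, GridTraining(0.1))
@named pde_system = PDESystem(eq, bcs, domains, [x, y], [u(x, y)])
prob = discretize(pde_system, discretization)

# Time a fixed number of optimizer steps so the three backends are comparable.
@btime solve($prob, Adam(0.1); maxiters = 100)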
@findmyway I think more results or tests could be shown in the benchmarks; any suggestions?
Could you set the second notebook as public? It seems I don't have the necessary permission to open it.
Sorry for that. It's OK now.
For the first one, I think overall it's pretty good.
It would be great if you could adjust the size of the network and make a chart comparing the results.
Also, I'd suggest adding some short descriptions of how you've designed the benchmark and, at the end of the notebook, drawing some conclusions on your side. That would make it much easier for someone else to review.
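Something along these lines could serve as the size sweep; a sketch with a Flux-only baseline, where the wrapped Paddle/Torch model would be timed in the same loop and added to the chart (the widths and batch size are arbitrary):

# Benchmark the backward pass for several hidden-layer widths and chart the results.
using Flux, BenchmarkTools, Plots

widths = [16, 64, 256, 1024]
bs = 32
times_ms = Float64[]

for w in widths
    model = Chain(Dense(2, w, tanh), Dense(w, w, tanh), Dense(w, 1))
    x = randn(Float32, 2, bs)
    t = @benchmark gradient(m -> sum(m($x)), $model)
    push!(times_ms, minimum(t).time / 1e6)  # best time, in ms
end

plot(widths, times_ms; xscale = :log10, yscale = :log10, marker = :o,
     label = "Flux backward", xlabel = "hidden width", ylabel = "time (ms)")
# Run the same loop with the wrapped Paddle/Torch model and add it with plot!(...)
# to get the comparison chart.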
Updated some benchmarks in Colab.
Check the benchmarks on Colab: Benchmark of NeuralPDE, Benchmark of Forward and Backward.