theogf / AugmentedGaussianProcesses.jl

Gaussian Process package based on data augmentation, sparsity and natural gradients
https://theogf.github.io/AugmentedGaussianProcesses.jl/dev/

Online as in dynamically adding new samples? #16

Open robertfeldt opened 5 years ago

robertfeldt commented 5 years ago

Wow, this is a very welcome package and I really look forward to using this.

You mention that in the future you plan to make this usable in an online setting. I assume you mean that new samples can be added dynamically, as the real/original function is being sampled and more information is obtained? This would be a real killer feature imho. Any idea of when you plan to work on this? And what kind of big-O/complexity performance can be expected?

Thanks for the great work on this!

theogf commented 5 years ago

Thanks!

Online GP is my current research focus, which is in its early stage. I currently have an (incomplete) implementation of Thang Bui's paper Streaming Sparse Gaussian Process Approximations (2017). It works very nicely and has complexity O(m^3), with m being the number of inducing points. It should work with data added dynamically (I just have to figure out how to do that in Julia; I am faking a sequential stream at the moment). This should hopefully be working during this year if everything goes well.
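For what it's worth, one way to fake a sequential stream in plain Julia is to wrap the data in a `Channel` and consume it batch by batch. This is only a sketch of the streaming pattern, not package code; `update!` below is a hypothetical placeholder for the model's per-batch update (the step with O(m^3) cost in the number of inducing points m):

```julia
# Simulate a sequential data stream by yielding minibatches from a Channel.
# X is an (n, d) matrix of inputs, y a length-n vector of targets.
function fake_stream(X, y; batchsize = 10)
    Channel() do ch
        for i in 1:batchsize:size(X, 1)
            j = min(i + batchsize - 1, size(X, 1))
            put!(ch, (X[i:j, :], y[i:j]))
        end
    end
end

# Hypothetical consumption loop; `update!` stands in for the streaming
# sparse-GP update, which costs O(m^3) per batch:
# for (Xb, yb) in fake_stream(X, y)
#     update!(model, Xb, yb)
# end
```

The `Channel` closes automatically when the producer task finishes, so the `for` loop terminates after the last batch.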

My research is on trying to find a way to determine m automatically given the data and the model choice. However, I would focus on data which is stationary (in terms of domain), as time series can be solved more easily via state-space techniques: see Infinite-Horizon Gaussian Processes.

robertfeldt commented 5 years ago

OK, great; when online mode is available I'll try it on some of my research problems and connect it with my BlackBoxOptim.jl package (as a surrogate model in Bayesian optimization).

robertfeldt commented 4 years ago

Any progress or plan for this?

theogf commented 4 years ago

Hey! Sorry for the delayed answer; online GPs are now at the top of the priority list, so you should be able to test a stable version soon. In the meantime, if you want to play around (there is no example yet), you can try the onlinegp branch.

theogf commented 4 years ago

Hi! It's maybe a bit late, but the online model is now out and (partly) tested. There is an example in the notebooks showing how to use it!
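For readers skimming this thread, the pieces that appear in the later comments fit together roughly as follows. This is a hedged sketch, not the notebook verbatim: the `OnlineSVGP` constructor and its argument order are assumptions, and `eachbatch` here is assumed to come from MLDataUtils.jl; only `OIPS(0.8, nothing)` and the `train!` loop are taken directly from the snippets below.

```julia
using AugmentedGaussianProcesses  # reexports KernelFunctions

kernel = SqExponentialKernel()    # from KernelFunctions
IP_alg = OIPS(0.8, nothing)       # online inducing-point selection (from this thread)

# Assumed constructor signature for the online sparse variational GP:
model = OnlineSVGP(kernel, GaussianLikelihood(), AnalyticVI(), IP_alg)

# Feed minibatches sequentially (loop taken from the comments below):
for (X_batch, y_batch) in eachbatch((X_train, y_train), obsdim = 1, size = 10)
    train!(model, X_batch, y_batch, iterations = 3)
end
```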

robertfeldt commented 4 years ago

Thanks, this looks great! It looks like a `using KernelFunctions` is missing in the notebook/example? And which package is needed for OIPS?

theogf commented 4 years ago

OIPS is included in AGP, and KernelFunctions should be reexported automatically by AGP. Let me double-check the example.

theogf commented 4 years ago

OK, I rechecked the example; there was one typo. Make sure that you have the latest version (0.7.0) of AGP and it should run smoothly.

robertfeldt commented 4 years ago

Thanks, Theo! On a fresh Julia 1.4 with AugmentedGaussianProcesses v0.7.1 I still run into some problems with the online example code. For this line I get an error, but it seems to be only in the show method, so it is not a show-stopper:

julia> IP_alg = OIPS(0.8,nothing)
Error showing value of type OIPS{Float64,Array{Float64,2},Nothing}:
ERROR: UndefRefError: access to undefined reference
Stacktrace:
 [1] getproperty at ./Base.jl:33 [inlined]
 [2] size at /home/robertfeldt/.julia/packages/AugmentedGaussianProcesses/JK8MB/src/inducingpoints/inducing_points.jl:16 [inlined]
 [3] axes at ./abstractarray.jl:75 [inlined]
 [4] summary at ./show.jl:2134 [inlined]
 [5] show(::IOContext{REPL.Terminals.TTYTerminal}, ::MIME{Symbol("text/plain")}, ::OIPS{Float64,Array{Float64,2},Nothing}) at ./arrayshow.jl:317

but then when training in the for loop I get:

julia> for (X_batch,y_batch) in eachbatch((X_train,y_train), obsdim=1, size=10)
           train!(model,X_batch,y_batch,iterations=3)
       end
ERROR: Mutating arrays is not supported
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] (::Zygote.var"#1039#1040")(::Nothing) at /home/robertfeldt/.julia/packages/Zygote/KNUTW/src/lib/array.jl:59
 [3] (::Zygote.var"#2757#back#1041"{Zygote.var"#1039#1040"})(::Nothing) at /home/robertfeldt/.julia/packages/ZygoteRules/6nssF/src/adjoint.jl:49
 [4] copytri! at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/LinearAlgebra/src/matmul.jl:440 [inlined]
 [5] copytri! at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/LinearAlgebra/src/matmul.jl:436 [inlined]
...

here's my versioninfo():

julia> versioninfo()
Julia Version 1.4.0
Commit b8e9a9ecc6 (2020-03-21 16:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: AMD Ryzen 7 1700X Eight-Core Processor
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, znver1)
Environment:
  JULIA_NUM_THREADS = 4
  JULIA_CMDSTAN_HOME = /home/robertfeldt/lib/cmdstan

I'll try also on 1.3.1 and without threads and report back if any change. Will also try on your latest master version.

robertfeldt commented 4 years ago

Same problem on Julia 1.3.1 and without threads. On your master branch the error message is different:

julia> for (X_batch,y_batch) in eachbatch((X_train,y_train), obsdim=1, size=10)
           train!(model,X_batch,y_batch,iterations=3)
       end
ERROR: MethodError: no method matching natural_gradient!(::Array{Float64,1}, ::Array{Float64,1}, ::Float64, ::AugmentedGaussianProcesses.AVIOptimizer{Float64,Descent}, ::Array{Float64,2}, ::AugmentedGaussianProcesses._OSVGP{Float64})
Closest candidates are:
  natural_gradient!(::AbstractArray{T,1} where T, ::AbstractArray{T,1} where T, ::Real, ::AugmentedGaussianProcesses.AVIOptimizer, ::AbstractArray{T,2} where T, ::AugmentedGaussianProcesses._VGP{T}) where {T, L} at /home/robertfeldt/.julia/packages/AugmentedGaussianProcesses/rhnm9/src/inference/analyticVI.jl:184
  natural_gradient!(::AbstractArray{T,1}, ::AbstractArray{T,1}, ::Real, ::AugmentedGaussianProcesses.AVIOptimizer, ::AbstractArray{T,2} where T, ::AugmentedGaussianProcesses._SVGP{T}) where T<:Real at /home/robertfeldt/.julia/packages/AugmentedGaussianProcesses/rhnm9/src/inference/analyticVI.jl:197
  natural_gradient!(::AbstractArray{T,1}, ::AbstractArray{T,1}, ::AnalyticVI, ::AugmentedGaussianProcesses.AVIOptimizer, ::AbstractArray{T,2} where T, ::AugmentedGaussianProcesses._OSVGP{T}) where T at /home/robertfeldt/.julia/packages/AugmentedGaussianProcesses/rhnm9/src/inference/analyticVI.jl:231
Stacktrace:
 [1] _broadcast_getindex_evalf at ./broadcast.jl:631 [inlined]
 [2] _broadcast_getindex at ./broadcast.jl:604 [inlined]
 [3] getindex at ./broadcast.jl:564 [inlined]

theogf commented 4 years ago

Thanks for testing it! It definitely shows there is something wrong with my tests :sweat_smile: I corrected the problem, and a new release should fix it soon. Threads should generally not be an issue.