avik-pal / Wandb.jl

Unofficial Julia bindings for logging experiments to wandb.ai
https://avik-pal.github.io/Wandb.jl/stable/
MIT License

Segfault from basic example #27

Closed MilesCranmer closed 1 month ago

MilesCranmer commented 9 months ago

@avik-pal I've been trying out TensorBoardLogger.jl and Wandb.jl in https://github.com/MilesCranmer/SymbolicRegression.jl/pull/277, and I find that:

  1. TensorBoardLogger.jl works, but
  2. Wandb.jl experiences a segfault.

Here's a MWE. First, install SymbolicRegression in the logger branch with:

julia -e 'using Pkg; pkg"add https://github.com/MilesCranmer/SymbolicRegression.jl#36015dec0ae1193615a98c92370278451dc23b92"'

Then, test it out with Wandb with:

using SymbolicRegression, Wandb, Logging, MLJBase

logger = WandbLogger(project="jl-tests")

X = rand(100, 2)
y = X[:, 1] + X[:, 2] .^ 2.5

model = SRRegressor(
    niterations=1000,
    binary_operators=[+, *, -, /],
    logger=logger
)
mach = machine(model, X, y)

fit!(mach)

which generates the segfault:

[51713] signal (11.2): Segmentation fault: 11--------------------------------
in expression starting at REPL[33]:1
dict_dealloc at /Users/mcranmer/.julia/environments/sr-tests/.CondaPkg/env/lib/libpython3.11.dylib (unknown line)
_PyFrame_Clear at /Users/mcranmer/.julia/environments/sr-tests/.CondaPkg/env/lib/libpython3.11.dylib (unknown line)
_PyEval_EvalFrameDefault at /Users/mcranmer/.julia/environments/sr-tests/.CondaPkg/env/lib/libpython3.11.dylib (unknown line)
_PyEval_Vector at /Users/mcranmer/.julia/environments/sr-tests/.CondaPkg/env/lib/libpython3.11.dylib (unknown line)
_PyVectorcall_Call at /Users/mcranmer/.julia/environments/sr-tests/.CondaPkg/env/lib/libpython3.11.dylib (unknown line)
_PyEval_EvalFrameDefault at /Users/mcranmer/.julia/environments/sr-tests/.CondaPkg/env/lib/libpython3.11.dylib (unknown line)
_PyEval_Vector at /Users/mcranmer/.julia/environments/sr-tests/.CondaPkg/env/lib/libpython3.11.dylib (unknown line)
_PyVectorcall_Call at /Users/mcranmer/.julia/environments/sr-tests/.CondaPkg/env/lib/libpython3.11.dylib (unknown line)
_PyEval_EvalFrameDefault at /Users/mcranmer/.julia/environments/sr-tests/.CondaPkg/env/lib/libpython3.11.dylib (unknown line)
_PyEval_Vector at /Users/mcranmer/.julia/environments/sr-tests/.CondaPkg/env/lib/libpython3.11.dylib (unknown line)
_PyVectorcall_Call at /Users/mcranmer/.julia/environments/sr-tests/.CondaPkg/env/lib/libpython3.11.dylib (unknown line)
_PyEval_EvalFrameDefault at /Users/mcranmer/.julia/environments/sr-tests/.CondaPkg/env/lib/libpython3.11.dylib (unknown line)
_PyEval_Vector at /Users/mcranmer/.julia/environments/sr-tests/.CondaPkg/env/lib/libpython3.11.dylib (unknown line)
method_vectorcall at /Users/mcranmer/.julia/environments/sr-tests/.CondaPkg/env/lib/libpython3.11.dylib (unknown line)
_PyVectorcall_Call at /Users/mcranmer/.julia/environments/sr-tests/.CondaPkg/env/lib/libpython3.11.dylib (unknown line)
PyObject_Call at /Users/mcranmer/.julia/packages/PythonCall/wXfah/src/cpython/pointers.jl:299 [inlined]
macro expansion at /Users/mcranmer/.julia/packages/PythonCall/wXfah/src/Py.jl:131 [inlined]
pycallargs at /Users/mcranmer/.julia/packages/PythonCall/wXfah/src/abstract/object.jl:211 [inlined]
#pycall#59 at /Users/mcranmer/.julia/packages/PythonCall/wXfah/src/abstract/object.jl:222
pycall at /Users/mcranmer/.julia/packages/PythonCall/wXfah/src/abstract/object.jl:218 [inlined]
#_#11 at /Users/mcranmer/.julia/packages/PythonCall/wXfah/src/Py.jl:341 [inlined]
Py at /Users/mcranmer/.julia/packages/PythonCall/wXfah/src/Py.jl:341 [inlined]
#log#5 at /Users/mcranmer/.julia/packages/Wandb/nfzlz/src/main.jl:69
log at /Users/mcranmer/.julia/packages/Wandb/nfzlz/src/main.jl:69 [inlined]
process at /Users/mcranmer/.julia/packages/Wandb/nfzlz/src/corelogging.jl:50
unknown function (ip: 0x2c863c0ff)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:3076
#handle_message#40 at /Users/mcranmer/.julia/packages/Wandb/nfzlz/src/corelogging.jl:70
handle_message at /Users/mcranmer/.julia/packages/Wandb/nfzlz/src/corelogging.jl:53
unknown function (ip: 0x2bcec00df)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:3076
jl_apply at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/./julia.h:1982 [inlined]
jl_f__call_latest at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/builtins.c:812
#invokelatest#2 at ./essentials.jl:889 [inlined]
invokelatest at ./essentials.jl:884 [inlined]
macro expansion at ./logging.jl:365 [inlined]
#36 at /Users/mcranmer/PermaDocuments/SymbolicRegression.jl/src/SearchUtils.jl:439
unknown function (ip: 0x2a57c917f)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:3076
with_logstate at ./logging.jl:515
with_logger at ./logging.jl:627 [inlined]
#default_logging_callback#35 at /Users/mcranmer/PermaDocuments/SymbolicRegression.jl/src/SearchUtils.jl:414 [inlined]
default_logging_callback at /Users/mcranmer/PermaDocuments/SymbolicRegression.jl/src/SearchUtils.jl:412 [inlined]
#29#30 at /Users/mcranmer/PermaDocuments/SymbolicRegression.jl/src/SymbolicRegression.jl:561 [inlined]
#29 at /Users/mcranmer/PermaDocuments/SymbolicRegression.jl/src/SymbolicRegression.jl:561 [inlined]
_equation_search at /Users/mcranmer/PermaDocuments/SymbolicRegression.jl/src/SymbolicRegression.jl:982
unknown function (ip: 0x28cb12623)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:3076
#equation_search#28 at /Users/mcranmer/PermaDocuments/SymbolicRegression.jl/src/SymbolicRegression.jl:567
equation_search at /Users/mcranmer/PermaDocuments/SymbolicRegression.jl/src/SymbolicRegression.jl:433
unknown function (ip: 0x28c784093)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:3076
#equation_search#24 at /Users/mcranmer/PermaDocuments/SymbolicRegression.jl/src/SymbolicRegression.jl:393
unknown function (ip: 0x2826c465f)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:3076
equation_search at /Users/mcranmer/PermaDocuments/SymbolicRegression.jl/src/SymbolicRegression.jl:338
#equation_search#26 at /Users/mcranmer/PermaDocuments/SymbolicRegression.jl/src/SymbolicRegression.jl:426 [inlined]
equation_search at /Users/mcranmer/PermaDocuments/SymbolicRegression.jl/src/SymbolicRegression.jl:423
unknown function (ip: 0x2826a4437)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:3076
_update at /Users/mcranmer/PermaDocuments/SymbolicRegression.jl/src/MLJInterface.jl:162
unknown function (ip: 0x28c324a37)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:3076
update at /Users/mcranmer/PermaDocuments/SymbolicRegression.jl/src/MLJInterface.jl:129
fit at /Users/mcranmer/PermaDocuments/SymbolicRegression.jl/src/MLJInterface.jl:123 [inlined]
fit at /Users/mcranmer/PermaDocuments/SymbolicRegression.jl/src/MLJInterface.jl:123
unknown function (ip: 0x28b58406b)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:3076
jl_apply at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/./julia.h:1982 [inlined]
do_apply at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/builtins.c:768
#fit_only!#57 at /Users/mcranmer/.julia/packages/MLJBase/ByFwA/src/machines.jl:680
fit_only! at /Users/mcranmer/.julia/packages/MLJBase/ByFwA/src/machines.jl:606 [inlined]
#fit!#63 at /Users/mcranmer/.julia/packages/MLJBase/ByFwA/src/machines.jl:778 [inlined]
fit! at /Users/mcranmer/.julia/packages/MLJBase/ByFwA/src/machines.jl:775
unknown function (ip: 0x282fbc123)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:3076
jl_apply at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/./julia.h:1982 [inlined]
do_call at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/interpreter.c:126
eval_body at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/interpreter.c:0
jl_interpret_toplevel_thunk at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/interpreter.c:775
jl_toplevel_eval_flex at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/toplevel.c:934
jl_toplevel_eval_flex at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/toplevel.c:877
jl_toplevel_eval_flex at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/toplevel.c:877
jl_toplevel_eval_flex at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/toplevel.c:877
ijl_toplevel_eval at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/toplevel.c:943 [inlined]
ijl_toplevel_eval_in at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/toplevel.c:985
eval at ./boot.jl:385 [inlined]
eval_user_input at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:150
repl_backend_loop at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:246
#start_repl_backend#46 at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:231
start_repl_backend at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:228
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:3076
#run_repl#59 at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:389
run_repl at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/REPL/src/REPL.jl:375
jfptr_run_repl_91817 at /Users/mcranmer/.julia/juliaup/julia-1.10.0+0.aarch64.apple.darwin14/lib/julia/sys.dylib (unknown line)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:3076
#1013 at ./client.jl:432
jfptr_YY.1013_82805 at /Users/mcranmer/.julia/juliaup/julia-1.10.0+0.aarch64.apple.darwin14/lib/julia/sys.dylib (unknown line)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:3076
jl_apply at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/./julia.h:1982 [inlined]
jl_f__call_latest at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/builtins.c:812
#invokelatest#2 at ./essentials.jl:887 [inlined]
invokelatest at ./essentials.jl:884 [inlined]
run_main_repl at ./client.jl:416
exec_options at ./client.jl:333
_start at ./client.jl:552
jfptr__start_82831 at /Users/mcranmer/.julia/juliaup/julia-1.10.0+0.aarch64.apple.darwin14/lib/julia/sys.dylib (unknown line)
_jl_invoke at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/gf.c:3076
jl_apply at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/./julia.h:1982 [inlined]
true_main at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/jlapi.c:582
jl_repl_entrypoint at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-HL2F7YQ3XH.0/build/default-honeycrisp-HL2F7YQ3XH-0/julialang/julia-release-1-dot-10/src/jlapi.c:731
Allocations: 81314433 (Pool: 81278727; Big: 35706); GC: 74
[1]    51712 segmentation fault  julia --project=@sr-tests

You can verify that running it with TensorBoardLogger does not produce any issues. So I'm not sure what's going wrong here...

Here's my system info:

julia> versioninfo()
Julia Version 1.10.0
Commit 3120989f39b (2023-12-25 18:01 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  CPU: 8 × Apple M1 Pro
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, apple-m1)
  Threads: 8 on 6 virtual cores
Environment:
  JULIA_NUM_THREADS = auto
  JULIA_FORMATTER_SO = /Users/mcranmer/julia_formatter.so
  JULIA_EDITOR = code

but hopefully you are able to reproduce.


For the record, I also see this on v0.4.4 with PyCall.jl.

lassepe commented 3 months ago

I'm seeing the same thing. My feeling is that this effect is amplified when logging at high frequency. When I throttle the logging sufficiently, the problem seems to disappear.
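
A minimal sketch of the throttling idea described above, in plain Julia. `Throttle` is a hypothetical helper written for illustration and is not part of Wandb.jl. It only limits how often the wrapped function runs, so it would reduce the logging rate rather than address the underlying crash. It is also not thread-safe as written.

```julia
# Hypothetical helper: wraps a function so that it runs at most once
# every `interval` seconds. Calls made too early are skipped.
mutable struct Throttle{F}
    f::F
    interval::Float64
    last_call::Float64
end

# Start with `last_call = -Inf` so that the very first call always runs.
Throttle(f, interval) = Throttle(f, Float64(interval), -Inf)

function (t::Throttle)(args...; kwargs...)
    t_now = time()
    if t_now - t.last_call >= t.interval
        t.last_call = t_now
        t.f(args...; kwargs...)
        return true   # the wrapped function was called
    end
    return false      # skipped because the last call was too recent
end

# Stand-in for a real logging call, to keep the sketch self-contained.
n_calls = Ref(0)
throttled = Throttle(x -> (n_calls[] += x), 10.0)
first_ran = throttled(1)
second_ran = throttled(1)
```

With a real logger, the wrapped function could be something like `d -> Wandb.log(lg, d)`, assuming `lg` is an existing `WandbLogger`.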

avik-pal commented 3 months ago

That would be consistent with the example Miles shared. I am guessing SR.jl would be logging at quite a high rate. That said, I am completely clueless as to how to solve it. Maybe someone from the Python-Julia Interop could help out here.

I will post this on Slack later today to ask for help (feel free to post it there; I don't have Slack access for the next hrs).

MilesCranmer commented 3 months ago

@avik-pal I wonder if it could be an interaction with Python multiprocessing? Is there a way to prevent wandb from spawning a separate process to communicate with wandb?

avik-pal commented 3 months ago

what happens if you pass in settings=wandb.Settings(; start_method="thread") to init?

MilesCranmer commented 3 months ago

Hm it seems like my original example doesn't start. Is Wandb.jl incompatible with the latest Python?

julia> using SymbolicRegression, Wandb, Logging, MLJBase
ERROR: InitError: Python: ModuleNotFoundError: No module named 'distutils'
Python stacktrace:
 [1] <module>
   @ /private/var/folders/1h/xyppkvx52cl6w3_h8bw_gdqh0000gr/T/tmp.GdDLQLXHGq/.CondaPkg/env/lib/python3.12/site-packages/wandb/env.py:16
 [2] <module>
   @ /private/var/folders/1h/xyppkvx52cl6w3_h8bw_gdqh0000gr/T/tmp.GdDLQLXHGq/.CondaPkg/env/lib/python3.12/site-packages/wandb/util.py:57
 [3] <module>
   @ /private/var/folders/1h/xyppkvx52cl6w3_h8bw_gdqh0000gr/T/tmp.GdDLQLXHGq/.CondaPkg/env/lib/python3.12/site-packages/wandb/sdk/lib/config_util.py:10
 [4] <module>
   @ /private/var/folders/1h/xyppkvx52cl6w3_h8bw_gdqh0000gr/T/tmp.GdDLQLXHGq/.CondaPkg/env/lib/python3.12/site-packages/wandb/sdk/wandb_helper.py:6
 [5] <module>
   @ /private/var/folders/1h/xyppkvx52cl6w3_h8bw_gdqh0000gr/T/tmp.GdDLQLXHGq/.CondaPkg/env/lib/python3.12/site-packages/wandb/sdk/__init__.py:24
 [6] <module>
   @ /private/var/folders/1h/xyppkvx52cl6w3_h8bw_gdqh0000gr/T/tmp.GdDLQLXHGq/.CondaPkg/env/lib/python3.12/site-packages/wandb/__init__.py:27
Stacktrace:
  [1] pythrow()
    @ PythonCall.Core ~/.julia/packages/PythonCall/S5MOg/src/Core/err.jl:92
  [2] errcheck
    @ ~/.julia/packages/PythonCall/S5MOg/src/Core/err.jl:10 [inlined]
  [3] pyimport(m::String)
    @ PythonCall.Core ~/.julia/packages/PythonCall/S5MOg/src/Core/builtins.jl:1444

avik-pal commented 3 months ago

Seems to work for me, at least in the beginning, but then it segfaults.

https://wandb.ai/avikpal/jl-tests?nw=nwuseravikpal

avik-pal commented 3 months ago

> what happens if you pass in settings=wandb.Settings(; start_method="thread") to init?

doesn't seem to work

avik-pal commented 3 months ago

Seems like the conda versions weren't updated for whatever reason. I have updated the code to install the pip version which is the latest one. @MilesCranmer can you check if https://github.com/avik-pal/Wandb.jl/pull/38 fixes the installation issue you had?

tylerjthomas9 commented 3 months ago

It looks like an issue with PythonCall.jl being called from threads other than the first Julia thread (https://juliapy.github.io/PythonCall.jl/stable/faq/#Is-PythonCall/JuliaCall-thread-safe?). The example at the top works for me when I use Julia with one thread and segfaults with 2+ threads. The following example also works with one Julia thread and segfaults with 2+ threads:

using Wandb, Logging

# Initialize the project
lg = WandbLogger(; project = "Wandb.jl", name = nothing)

# Set logger globally / in scope / in combination with other loggers
global_logger(lg)

# Logging Values
function log_wandb()
    Wandb.log(lg, Dict("accuracy" => 0.9, "loss" => 0.3))
end
Threads.@threads for i in 1:1000
    log_wandb()
end

I found a potential solution here to ensure that Python functions are called from the main thread.

using Wandb, Logging, ThreadPools

# Initialize the project
lg = WandbLogger(; project = "Wandb.jl", name = nothing)

# Set logger globally / in scope / in combination with other loggers
global_logger(lg)

# Logging Values
macro pythread(expr)
    quote
        fetch(@tspawnat 1 begin
            $(esc(expr))
        end)
    end
end

function log_wandb()
    @pythread begin
        Wandb.log(lg, Dict("accuracy" => 0.9, "loss" => 0.3))
    end
end

Threads.@threads for i in 1:1000
    log_wandb()
end
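
For comparison, here is a sketch of the same idea (funnel every call through one thread) using only Base Julia. `SerialExecutor` and `run_serial` are hypothetical names, not part of Wandb.jl, PythonCall.jl, or ThreadPools. It relies on `@async` creating a sticky task, which stays on the thread that created it, so it should be constructed from the thread that is meant to make the Python calls.

```julia
# Hypothetical helper: runs every submitted closure on one worker task,
# so all calls happen on the same thread.
struct SerialExecutor
    jobs::Channel{Tuple{Function,Channel{Any}}}
    worker::Task
end

function SerialExecutor()
    jobs = Channel{Tuple{Function,Channel{Any}}}(Inf)
    # `@async` creates a sticky task: it stays on the thread that created it.
    worker = @async for (f, reply) in jobs
        outcome = try
            (true, f())
        catch err
            (false, err)
        end
        put!(reply, outcome)
    end
    return SerialExecutor(jobs, worker)
end

# Submit `f`, block until the worker has run it, and rethrow any error.
function run_serial(f, ex::SerialExecutor)
    reply = Channel{Any}(1)
    put!(ex.jobs, (f, reply))
    ok, value = take!(reply)
    ok || throw(value)
    return value
end

ex = SerialExecutor()
thread_ids = zeros(Int, 100)
Threads.@threads for i in 1:100
    # With Wandb.jl, the closure would wrap a call such as Wandb.log(lg, ...).
    thread_ids[i] = run_serial(() -> Threads.threadid(), ex)
end
close(ex.jobs)  # lets the worker loop finish
```

Every entry of `thread_ids` should be the same thread id, whichever threads the loop iterations ran on.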

lassepe commented 3 months ago

Multi-threaded calls to PythonCall are documented as not allowed. However, this problem is not limited to multi-threading on the Julia side. Neither the cases where I have seen this happen nor the original example above involve multiple Julia threads.

lassepe commented 2 months ago

Did anyone try https://github.com/JuliaPy/PythonCall.jl/pull/520 on this?

lassepe commented 2 months ago

The segfaults indeed seem to be all fixed with https://github.com/JuliaPy/PythonCall.jl/pull/520 :tada:

MilesCranmer commented 2 months ago

That's great!

trahflow commented 2 months ago

> Did anyone try https://github.com/JuliaPy/PythonCall.jl/pull/520 on this?

Yes that seems to fix it! :tada:

avik-pal commented 2 months ago

Awesome! I will bump the compat for PythonCall once that PR lands.