chengchingwen / Transformers.jl

Julia Implementation of Transformer models

Segfault in embedding function #73

Closed. austinbean closed this issue 3 years ago.

austinbean commented 3 years ago

I'm following the example from the docs here, but I almost always get a segfault when I try to evaluate the embedding function. The embedding dimension seems to make the problem worse (e.g., if I set `value` below to 512, it always segfaults; when it's 256, it segfaults maybe 85% of the time). This is on Julia 1.6.2. I would appreciate any suggestions!

MWE:

```julia
using Transformers
using Transformers.Basic
using Transformers.Pretrain

bert_model, wordpiece, tokenizer = pretrain"bert-uncased_L-12_H-768_A-12"
vocab = Vocabulary(wordpiece)
value = 256
pe = PositionEmbedding(value)

embed = Embed(value, 100)

function embedding(x)
    we = embed(x, inv(sqrt(512)))
    e = we .+ pe(we)
    return e
end

v = [8739 2008]
embedding(v)
```

The error I get follows, if it's of any help:

```
signal (11): Segmentation fault: 11
in expression starting at /Users/austinbean/Desktop/programs/emr_nlp/seg_mwe.jl:24
getindex at ./array.jl:802 [inlined]
macro expansion at ./multidimensional.jl:860 [inlined]
macro expansion at ./cartesian.jl:64 [inlined]
_unsafe_getindex! at ./multidimensional.jl:855 [inlined]
_unsafe_getindex at ./multidimensional.jl:846
_getindex at ./multidimensional.jl:832 [inlined]
getindex at ./abstractarray.jl:1170 [inlined]
macro expansion at /Users/austinbean/.julia/packages/Transformers/3YgSd/src/basic/embeds/gather.jl:18 [inlined]
#272#threadsfor_fun at ./threadingconstructs.jl:81
#272#threadsfor_fun at ./threadingconstructs.jl:48
unknown function (ip: 0x113fa4c8c)
jl_apply_generic at /Applications/Julia-1.6.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.6.dylib (unknown line)
start_task at /Applications/Julia-1.6.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.6.dylib (unknown line)
Allocations: 43744691 (Pool: 43733847; Big: 10844); GC: 31
zsh: segmentation fault  julia
```

chengchingwen commented 3 years ago

It's because `v = [8739 2008]` exceeds the vocabulary size, which is 100 in `embed = Embed(value, 100)`.
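A minimal sketch of one way to adjust the MWE along these lines, assuming `length(vocab)` returns the size of the wordpiece vocabulary (roughly 30k entries for this BERT model); the embedding table is sized to the vocabulary so every index in `v` refers to a valid row:

```julia
using Transformers, Transformers.Basic, Transformers.Pretrain

bert_model, wordpiece, tokenizer = pretrain"bert-uncased_L-12_H-768_A-12"
vocab = Vocabulary(wordpiece)

value = 256
pe = PositionEmbedding(value)
embed = Embed(value, length(vocab))   # size the table to the real vocabulary, not 100

function embedding(x)
    we = embed(x, inv(sqrt(value)))   # scale by the embedding dimension actually used
    return we .+ pe(we)
end

v = [8739 2008]    # both indices now fall within 1:length(vocab)
embedding(v)
```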

austinbean commented 3 years ago

Thanks...

That looks like it solved the problem. I had it set to the size of the vocab in my problem, not the size of the vocab in the model I was loading.
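For completeness, a sketch of how token indices matching the loaded model's vocabulary could be produced, following the tokenizer → wordpiece → vocabulary pattern from the Transformers.jl pretrain docs (the exact calls below are an assumption, not taken from this thread):

```julia
# Hypothetical illustration: derive indices from the loaded vocabulary rather
# than hard-coding them, so they always fall inside 1:length(vocab).
text    = "a peck of pickled peppers"
tokens  = text |> tokenizer |> wordpiece   # wordpiece tokens for the BERT model
indices = vocab(tokens)                    # integer indices into the model's vocabulary
embedding(indices)                         # all indices are in range for the Embed above
```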