JuliaText / TextAnalysis.jl

Julia package for text analysis

using TextAnalysis suddenly takes very long #103

Closed Paethon closed 6 years ago

Paethon commented 6 years ago

Hi,

I have noticed that using TextAnalysis suddenly takes very long, and not just the first time, when precompilation runs.

It hangs without any output for about a minute and then prints "loaded". Any idea what could be causing this? I did not have this problem previously.

zgornel commented 6 years ago

It may be sentiment.jl. Some benchmarks (with TextAnalysis precompiled). Without it:

julia> @time using TextAnalysis
  6.319135 seconds (14.38 M allocations: 744.554 MiB, 4.81% gc time)

With it:

julia> @time using TextAnalysis
loaded
 12.262849 seconds (25.94 M allocations: 1.325 GiB, 5.45% gc time)

Paethon commented 6 years ago

Is there a way to initialize the sentiment analysis code only when it is actually used? That seems like a heavy load-time cost for something that presumably only a fraction of users need.
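One common way to defer this kind of work is to cache the model in a Ref and fill it on first use. The following is only a minimal sketch of that pattern, not TextAnalysis code: load_sentiment_model, get_model, toy_sentiment, and the toy dictionary "model" are all hypothetical names invented for illustration.

```julia
# Counts how often the (hypothetical) expensive load actually runs.
const LOAD_COUNT = Ref(0)

# Hypothetical stand-in for an expensive model load (e.g. reading weights from disk).
function load_sentiment_model()
    LOAD_COUNT[] += 1
    return Dict("good" => 1.0, "bad" => -1.0)  # toy "model": per-word weights
end

# The cache starts empty, so nothing is loaded when this code is first included.
const MODEL = Ref{Union{Nothing, Dict{String, Float64}}}(nothing)

# Load the model on first access and reuse the cached copy afterwards.
function get_model()
    if MODEL[] === nothing
        MODEL[] = load_sentiment_model()
    end
    return MODEL[]
end

# Toy scoring function: sums per-word weights; unknown words contribute 0.
function toy_sentiment(text::AbstractString)
    model = get_model()
    score = 0.0
    for w in split(lowercase(text))
        score += get(model, String(w), 0.0)
    end
    return score
end
```

With this layout, the loader runs on the first call to toy_sentiment and not before. Note that the pattern only defers work done inside the loader; it does not avoid the load time of any dependencies that the package imports at the top level.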

zgornel commented 6 years ago

Requires.jl could be used to execute include("sentiment.jl") only when Flux.jl, BSON.jl, etc. are already loaded (not sure if it would work).

Alternatively, a simpler approach would be to fork, comment out the line that includes sentiment.jl, and use the personal fork from then on, rebasing on upstream:master whenever needed.

It is still strange that including sentiment analysis incurs such a load-time penalty. Maybe someone will have a good explanation (and solution) for the issue...

Paethon commented 6 years ago

Sure, I did just that for myself, but I would think that the much longer load time is not what most other people want.

I am also not sure where this delay happens (where is the "loaded" print coming from?). To me it seems like sentiment.jl already loads the whole model when TextAnalysis is imported, which I assume is not what was intended.

zgornel commented 6 years ago

Maybe the model gets loaded at parse time (by a macro?), since no calls are made to load the model... this is indeed strange.

Paethon commented 6 years ago

OK, I checked further. The problem really is simply the using Flux statement.
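A rough way to check a single dependency like this is to time it in a fresh Julia process, so that packages already loaded in the current session do not hide the cost. The sketch below assumes the package is installed in the environment the child process sees; fresh_session_time is a hypothetical helper name, and the stdlib package Dates is used only as a placeholder.

```julia
# Hypothetical helper: run `code` in a fresh Julia process and return the
# wall-clock time in seconds. The figure includes Julia's own startup time.
function fresh_session_time(code::AbstractString)
    cmd = `$(Base.julia_cmd()) --startup-file=no -e $code`
    return @elapsed run(cmd)
end

# Subtract a do-nothing baseline to approximate the cost of the import alone.
baseline = fresh_session_time("nothing")
with_pkg = fresh_session_time("using Dates")  # placeholder; substitute e.g. "using Flux"
println("approx. import cost: ", with_pkg - baseline, " s")
```

Repeating the measurement a few times helps, since process startup time varies between runs and the first run after an update may include precompilation.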