Closed ValeriiBaidin closed 4 years ago
Is it possible to use Int32 for docs.terms? It would help save memory.
p.s. I think even Int16 is enough. =))
I am sorry to bother you again. Thank you so much.
I did it by myself. I hope it helps to save memory.
Would you also help me change doc.counts to Int16? It would save 75% of the space.
Where do I have to change the code? I have a problem with a buffer:
in keyword argument hostbuf, expected Union{Nothing, Array{Int64,N} where N}, got Array{Int16,1}
Thank you so much.
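The error above is a plain type mismatch: the buffer is still declared over Int64 host arrays while the document data is now Int16, so the host array has to be narrowed to match the buffer's element type. A minimal sketch of the idea in plain Julia; the commented `cl.Buffer` line assumes the OpenCL.jl API used later in this thread and is not run here:

```julia
# Narrow the host array first so its element type matches the buffer's
# element type. Int16.(x) makes a narrowed copy (values must fit in Int16,
# otherwise the conversion throws an InexactError).
terms64 = Int64[3, 7, 7, 12]     # original Int64 document data
terms16 = Int16.(terms64)
@assert eltype(terms16) == Int16

# Hypothetical, assuming OpenCL.jl and an initialized model (not run here):
# model.terms_buffer = cl.Buffer(Int16, model.context, (:r, :copy), hostbuf=terms16)
```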
Hi Valerii,
So for the gpuLDA model, if you're changing both doc.terms and doc.counts to Int16, then you need to change lines 388 and 390 in modelutils.jl to,
model.terms_buffer = cl.Buffer(Int16, model.context, (:r, :copy), hostbuf=terms)
model.counts_buffer = cl.Buffer(Int16, model.context, (:r, :copy), hostbuf=counts)
You also then need to change lines 31 and 33 in gpuLDA.jl to,
terms_buffer::cl.Buffer{Int16}
counts_buffer::cl.Buffer{Int16}
And then also lines 282 and 312 in gpuLDA.jl to,
const global short *counts,
const global short *terms,
I think this is everything, but I haven't tested it out myself, so I can't guarantee that it will work.
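For reference, the memory saving from narrowing is easy to check with plain Julia, no GPU required; this is just arithmetic on the element sizes, not part of the library:

```julia
# Each Int64 takes 8 bytes; Int16 takes 2, so narrowed counts use 75% less
# memory, and Int32 terms use 50% less.
@assert sizeof(Int64) == 8 && sizeof(Int16) == 2
saving16 = 1 - sizeof(Int16) / sizeof(Int64)
@assert saving16 == 0.75          # the 75% figure mentioned above
saving32 = 1 - sizeof(Int32) / sizeof(Int64)
@assert saving32 == 0.5
```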
On a more general note, I may look into defining Int32 and Int16 constructors for the Document and Corpus types. However, I'll need to think some about how exactly this will work and any potential side effects before I actually make the changes.
I've done it for my use case for now (GPU): terms as Int32, counts as Int16. It saves memory, and I hope it also increases speed.
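One caution worth noting on this split (my observation, not from the thread's code): Int16 cannot represent vocabulary indices above typemax(Int16) = 32767, so Int16 terms would break on a moderately large vocabulary, while per-document counts rarely get that high. That is why Int32 for terms and Int16 for counts is a reasonable combination; a quick sketch with a hypothetical vocabulary size:

```julia
# Int16 tops out at 32767, so it is safe for per-document counts but not
# for vocabulary indices; Int32 covers vocabularies up to about 2.1 billion.
@assert typemax(Int16) == 32767
vocab_size = 50_000                     # hypothetical vocabulary size
@assert vocab_size > typemax(Int16)     # Int16 terms cannot index this vocabulary
@assert vocab_size <= typemax(Int32)    # Int32 terms are fine
```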