Open Vimiso opened 4 weeks ago
The token dictionary takes up most of the allocated memory. We need to keep the entire dictionary in memory so that encoding text into tokens, and decoding tokens back into text, is efficient. Currently, the built-in array type is used for this. I don't know of a way to reduce the memory consumed here.
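To illustrate why an in-memory dictionary can be several times larger than the cached file, here is a rough sketch in Python. It is not this library's code: the vocabulary is synthetic, the helper names (`build_vocab`, `serialized_size`, `in_memory_size`) are made up for the example, and per-entry overhead in other runtimes will differ. It only shows the general effect that every entry carries object and hash-table overhead on top of its raw bytes.

```python
import sys

def build_vocab(n):
    # Hypothetical stand-in vocabulary: n distinct byte-string tokens mapped to integer ranks.
    return {f"tok{i}".encode(): i for i in range(n)}

def serialized_size(vocab):
    # Bytes needed for "token rank\n" lines, roughly what a cached vocab file would hold.
    return sum(len(tok) + len(str(rank)) + 2 for tok, rank in vocab.items())

def in_memory_size(vocab):
    # Rough lower bound: the dict's hash table plus every key and value object.
    size = sys.getsizeof(vocab)
    size += sum(sys.getsizeof(tok) + sys.getsizeof(rank) for tok, rank in vocab.items())
    return size

vocab = build_vocab(100_000)
# Reverse map for decoding; it reuses the same key/value objects but needs its own hash table.
decoder = {rank: tok for tok, rank in vocab.items()}

disk = serialized_size(vocab)
memory = in_memory_size(vocab) + sys.getsizeof(decoder)
print(f"serialized: {disk / 1e6:.1f} MB, in memory: {memory / 1e6:.1f} MB")
```

The exact ratio depends on the runtime and on token lengths, so treat the printed numbers as indicative only.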
Take the given test:
26 MB seems a bit much, no? Especially considering the cached vocab is only 3.6 MB.