HazyResearch / bootleg

Self-Supervision for Named Entity Disambiguation at the Tail
http://hazyresearch.stanford.edu/bootleg
Apache License 2.0

Version comparison between bootleg 1.0.0 and bootleg 1.1.0 #70

Closed: changranelk closed this issue 3 years ago

changranelk commented 3 years ago

Dear author, first of all, thanks for developing and open-sourcing this amazing tool!

I have a quick question about the difference between bootleg 1.0.0 and 1.1.0. Is bootleg 1.1.0 strictly better than 1.0.0, in that it achieves at least the same accuracy while also having better performance? Some concrete numbers would be really helpful.

Also, in the change log, I noticed that:

Could you provide the number for XXX, or is it not yet measured? Also, I was wondering why bootleg 1.1.0 has the same GPU memory requirement as bootleg 1.0.0. (From my understanding, the huge entity embedding matrix no longer needs to be stored in memory.) I guess maybe it is because in bootleg 1.1.0 the entity embeddings generated by the BERT-based entity encoder are also stored in memory?

Any comments and feedback are appreciated!

lorr1 commented 3 years ago

Hey.

Thanks for the questions. The bootleg 1.1 model is still in alpha and being tested, which is why we haven't fully updated our documentation yet. The new version should be the same quality as (or better than) the old version in terms of accuracy. In terms of performance, the memory requirements are lower and it is easier to use, though training is slower. I'm still iterating on inference time to see how fast we can make it.

The estimated values for the XXX are:

You will need at least 40 GB of disk space, 8 GB of GPU memory, and 35 GB of CPU memory.

The memory requirement on the GPU is lower because you no longer need the massive entity embedding matrix. I've gotten as low as around 2 GB in inference mode with an eval batch size of 1. The model itself does not require a lot of space, but the auxiliary entity metadata does. If anything is too much for you, let me know, and I can try to decrease the usage more.
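To give a rough sense of where the saving comes from, here is a back-of-envelope sketch; the entity count, embedding dimension, and batch shapes below are illustrative placeholders rather than Bootleg's actual configuration.

```python
# Rough footprint of a dense entity embedding matrix
# (illustrative numbers, not Bootleg's actual configuration).
num_entities = 5_000_000   # hypothetical number of entities in the profile
embed_dim = 512            # hypothetical embedding dimension
bytes_per_float = 4        # fp32

matrix_gb = num_entities * embed_dim * bytes_per_float / 1024**3
print(f"Full entity embedding matrix: ~{matrix_gb:.1f} GB")  # ~9.5 GB

# When entities are embedded on the fly by the encoder, only the current
# batch of candidates lives on the GPU at once.
batch_size, num_cands = 32, 30
batch_mb = batch_size * num_cands * embed_dim * bytes_per_float / 1024**2
print(f"Per-batch candidate embeddings: ~{batch_mb:.2f} MB")  # ~1.9 MB
```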

changranelk commented 3 years ago

Hi, thanks for the quick response! The answer helped a lot. I just have a quick follow-up: for bootleg 1.1.0, if we have enough memory, is it better to precompute and store the entity embedding matrix generated by the BERT encoder? That way we wouldn't need to generate the candidate entity embeddings at inference time, and we would have a large entity embedding matrix similar to what bootleg 1.0.0 used.

The reason I am asking is that my system has plenty of CPU and GPU memory, and I just want to figure out how to improve accuracy and performance assuming memory is not a limitation for bootleg.

Excited for the bootleg 1.1 model to be released!

lorr1 commented 3 years ago

Sorry for the delay on this.

Yes, that would save inference time, as you would only have to do a forward pass through the context encoder rather than the entity encoder, too. I have support for generating this embedding matrix here.
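A minimal sketch of the idea in PyTorch, assuming a hypothetical `entity_encoder` and tokenized `entity_inputs` as stand-ins for Bootleg's actual entity encoder and entity data:

```python
import torch

@torch.no_grad()
def build_entity_matrix(entity_encoder, entity_inputs, batch_size=256):
    """Run every entity through the encoder once and stack the embeddings.

    `entity_encoder` and `entity_inputs` are hypothetical stand-ins for
    Bootleg's BERT-based entity encoder and its tokenized entity data.
    """
    entity_encoder.eval()
    chunks = []
    for start in range(0, len(entity_inputs), batch_size):
        batch = entity_inputs[start:start + batch_size]
        chunks.append(entity_encoder(batch).cpu())
    return torch.cat(chunks)  # (num_entities, embed_dim)

# Precompute once, save to disk, and reuse at inference time:
# embs = build_entity_matrix(entity_encoder, entity_inputs)
# torch.save(embs, "entity_embeddings.pt")
```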

My plan is to add support in the annotator for providing a saved entity embedding matrix in the future, but it is totally doable.

changranelk commented 3 years ago

Thanks for the input! By the way, to measure performance, we created a random embedding matrix and cached it in memory. Compared to generating the embeddings on the fly, the cached lookup boosted end-to-end throughput by around 30x.
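For reference, the cached path we measured reduces to an embedding lookup plus a dot product against the context embedding. A minimal sketch, assuming a precomputed `entity_embs` matrix (e.g. the one saved above) and hypothetical names:

```python
import torch

# entity_embs: precomputed (num_entities, dim) matrix, e.g. loaded with
# entity_embs = torch.load("entity_embeddings.pt")
def score_candidates(context_emb, candidate_ids, entity_embs):
    """Score candidates via lookup + dot product (no entity encoder pass).

    context_emb:   (dim,) embedding from the context encoder.
    candidate_ids: (num_cands,) long tensor of entity row indices.
    """
    cand_embs = entity_embs[candidate_ids]  # cheap in-memory lookup
    return cand_embs @ context_emb          # (num_cands,) scores
```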

I will close the issue now. Thanks again for your detailed answer!