Closed jansonleeljs closed 3 years ago
It is safe to say that right now the market by market computations (solving the contraction mapping, etc.) are all parallelized, but nothing has been optimized or tested on the GPU.
In my experience, on large-scale real-world examples, PyBLP does a pretty good job of keeping all of your cores occupied. It helps that NumPy delegates to BLAS, which also knows how to use multiple threads for things like matrix-matrix multiplication.
One particular disadvantage of GPUs is that they generally default to single-precision arithmetic, which can be problematic if shares get really small, etc.
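To make the precision concern concrete, here is a minimal NumPy sketch (not from PyBLP; the tiny share value is made up for illustration) showing how a very small market share underflows in single precision, which breaks downstream operations like taking logs:

```python
import numpy as np

# Hypothetical tiny market share, e.g. from an exponentiated utility.
tiny_share = 1e-50

# In double precision the value and its log survive.
print(np.float64(tiny_share))          # small but nonzero
print(np.log(np.float64(tiny_share)))  # roughly -115.13

# In single precision the value underflows to exactly zero,
# so its log diverges to -inf (with a runtime warning).
print(np.float32(tiny_share))          # 0.0
print(np.log(np.float32(tiny_share)))  # -inf
```

Since BLP-style estimation routinely takes logs of shares inside the contraction mapping, a single underflow like this can poison the whole fixed-point iteration.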
Yeah, we don't have any explicit GPU support. I'm going to close this for now because it seems like we've answered your question, but feel free to re-open / keep commenting / etc if you have other questions or want some pointers towards BLP-type estimation with GPUs.
I've had success with JAX, which to Chris's point needs to be manually configured to use double precision. The downsides are (1) there tend to be sharp edges in JAX-like libraries that make them a bit more difficult to use, and (2) you'd have to re-code / test the estimation routine yourself.
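For reference, the manual configuration mentioned above is a one-liner; this is a minimal sketch of enabling 64-bit floats in JAX (the flag must be set before arrays are created):

```python
import jax
import jax.numpy as jnp

# JAX defaults to 32-bit floats; opt in to double precision up front.
jax.config.update("jax_enable_x64", True)

x = jnp.ones(3)
print(x.dtype)  # float64 once x64 is enabled
```

Without this flag, JAX silently truncates every array to float32, which runs straight into the small-shares precision issue discussed above.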
I read the code in basics.py and problem.py, and understand that the market-by-market computations are distributed across multiple workers by multiprocessing.pool and the generate_items function. I'm wondering whether it would be faster if the market-by-market computations were allocated to a GPU instead. I would really appreciate any suggestions you can offer.
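As a toy illustration of the pattern being asked about (this is not PyBLP's actual code; `solve_market` and the dummy workload are hypothetical stand-ins), the market-by-market dispatch looks roughly like this with a `multiprocessing.pool` pool:

```python
from multiprocessing.pool import ThreadPool

# Hypothetical stand-in for PyBLP's per-market work (e.g. solving the
# contraction mapping in one market); the real routine is far more involved.
def solve_market(market_id):
    return market_id, sum(i * i for i in range(market_id + 1))

market_ids = [0, 1, 2, 3]

# Distribute independent markets across a pool of workers, analogous
# in spirit to how generate_items hands out per-market jobs.
with ThreadPool(processes=2) as pool:
    results = dict(pool.map(solve_market, market_ids))

print(results)  # {0: 0, 1: 1, 2: 5, 3: 14}
```

Because markets are independent, this maps cleanly onto CPU workers; moving it to a GPU would instead mean batching the per-market linear algebra into large array operations, which is a different (and more invasive) restructuring.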