vaexio / vaex

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
https://vaex.io
MIT License

Unable to find vaex-distributed examples #655

Open fuhao009 opened 4 years ago

fuhao009 commented 4 years ago

Can you provide a use case for vaex-distributed?

JovanVeljanoski commented 4 years ago

I don't believe we have any published examples of this. It was done more as a proof of concept, to help us determine how to develop vaex in the future.

Perhaps @maartenbreddels can give you a very simple use case.

Just curious: do you have a particular need for distributed computing, or do you just want to try things out?

kyprifog commented 4 years ago

I'm also curious what the philosophy behind vaex-distributed is. All the examples I see use a single big machine. Is that where vaex is mostly focused?

maartenbreddels commented 4 years ago

Hi,

We have thought a lot about vaex-distributed. Our current position is that you probably don't need it: get a bigger instance or faster storage and you are good to go. The extra support burden for open source usage is too big at the moment, so we are developing this in vaex-enterprise (closed source). I can imagine us offering research- and university-friendly licenses in the future. I hope you understand our position. Feedback is welcome, though!

Regards,

Maarten