Thank you for this amazing work! I am really excited to use the library; however, I am running into a slight problem because my actual dataset is ~200 million rows and cannot fit into a pandas dataframe. Currently, I am working with vaex. I am therefore wondering if there are any suggestions or ideas for supporting this. I would also be open to working on this as a contribution. Thanks in advance!
Edit:
I don't know how I missed the info box for "Large Datasets". Seeing the suggestion there, maybe it would be helpful to write an example tutorial that explains this. I will try applying that suggestion to my use case, and if there is interest, I could contribute a tutorial or discuss other extensions for large datasets.