luminousmen / luminousmen.com

https://luminousmen.com/post/big-data-file-formats?utterances=1ffcb39dc9f72868b53b6a80OPkqHaVyAExuaP%2BcQgdS2LeTD8URA2SwoWUZCrcZqdwdvvqPPmBxGibScUxM%2FrMjvnpaYioRr2yOdIE0NnRXUeXIJdFZoAuZwR%2F5E8Dh5sCSm%2B%2BEROzokPstm0E%3D #63

Open utterances-bot opened 1 year ago

utterances-bot commented 1 year ago

Big Data file formats - Blog | luminousmen

Which data format should you pick for your next Big Data project: CSV, JSON, Parquet, or Avro?

goldenpine commented 1 year ago

Hi, this is an awesome comparison of those formats! I wanted to run those tests. One question about the files used: netflix.json, netflix.csv, netflix.avro, and netflix.parquet. The original Netflix Prize Data Set seems to contain only thousands of small CSV files. Could you share the consolidated files in those four formats?