bolcom / serenade-experiments-sigmod

Apache License 2.0

Missing catalog file #2

Closed: GresgentonG closed this issue 2 years ago

GresgentonG commented 2 years ago

In neither the datasets you provided nor the original retailrocket dataset from Kaggle can we find details on how to provide the catalog file (the catalog_input_dir argument of the function create_serenade_indexes), which contains some information about ForSale and IsAdult. This information also seems to be used later in the server, so it would be very helpful if you could give us some further detail about this. Thanks!

bkersbergen commented 2 years ago

Hi @GresgentonG,

I think this question is related to #1.

In create_serenade_indexes, we shared the code we use to create the serenade index offline for our production setup, which runs on our private data. To reproduce the experiments, you can just use the CSV reader, which works great for datasets up to ~200M rows.

GresgentonG commented 2 years ago

Thanks for the reply! Yes, I think this issue is related to #1. I will post some of my follow-up questions there and close this issue.