Closed GresgentonG closed 2 years ago
Hi @GresgentonG ,
I think this question is related to #1 .
We shared our code on how we create the serenade index offline for our production setup on our private data in `create_serenade_indexes`.
To reproduce the experiments, you can just use the CSV reader, which works great for datasets up to ~200M rows.
Thanks for the reply! Yes, I think this issue is related to issue #1. I will post some of my follow-up questions there and close this issue.
In both the datasets you provided and the original retailrocket dataset from Kaggle, we cannot find details on how we should provide the catalog file (the `catalog_input_dir` argument of the function `create_serenade_indexes`), which contains some information about `ForSale` and `IsAdult`. This information also seems to be used later in the server, so it would be very helpful if you could give us some further detail about this. Thanks!