BelstrelGit / OtusDataDriven


Do you know a code example for running criteo-1tb-benchmark fully locally, without Spark (some kind of online learning), for https://labs.criteo.com/2013/12/download-terabyte-click-logs-2/ ? #1

Open Sandy4321 opened 4 years ago

Sandy4321 commented 4 years ago

Do you know a code example for running criteo-1tb-benchmark fully locally, without Spark? Some kind of online learning? It is for https://labs.criteo.com/2013/12/download-terabyte-click-logs-2/

BelstrelGit commented 4 years ago

Hi Sandy, no, I have not worked without Spark. What do you mean by "locally"? Spark can also run in local mode on a single machine, without a cluster.

Sandy4321 commented 4 years ago

Yes, that is the question: how to process this data without Spark at all. Spark has advantages when many computers are available, but for processing on one local computer it is useless and even harmful.

Sandy4321 commented 4 years ago

https://stackoverflow.com/questions/38079853/how-can-i-implement-incremental-training-for-xgboost

Sandy4321 commented 4 years ago

Incremental learning is the golden key in this situation
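
To make the idea concrete, here is a minimal sketch of incremental (online) learning in plain Python, with no Spark and no third-party libraries. It is not the XGBoost approach from the linked question; it swaps in logistic regression trained by SGD with the hashing trick, one line at a time. It assumes each line is tab-separated with the click label first and the feature columns after it; for simplicity every column, numeric ones included, is treated as a categorical token. The names `NUM_BUCKETS`, `hash_features`, `update` and `train_stream` are made up for this illustration.

```python
import math
import zlib

NUM_BUCKETS = 2 ** 20  # hypothetical hash-space size; tune to available memory


def hash_features(fields):
    """Map each non-empty (column index, value) pair to a bucket index."""
    return [zlib.crc32(f"{i}={v}".encode()) % NUM_BUCKETS
            for i, v in enumerate(fields) if v != ""]


def predict(weights, idxs):
    """Logistic prediction from a sparse weight dict."""
    z = sum(weights.get(i, 0.0) for i in idxs)
    z = max(min(z, 35.0), -35.0)  # clamp to keep exp() in range
    return 1.0 / (1.0 + math.exp(-z))


def update(weights, idxs, label, lr=0.1):
    """One SGD step on a single example; returns the pre-update prediction."""
    p = predict(weights, idxs)
    grad = p - label  # gradient of log loss w.r.t. the linear score
    for i in idxs:
        weights[i] = weights.get(i, 0.0) - lr * grad
    return p


def train_stream(lines, weights=None, lr=0.1):
    """Consume an iterable of lines once; memory depends on the model, not the data."""
    weights = {} if weights is None else weights
    total_loss, n = 0.0, 0
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        label, fields = float(parts[0]), parts[1:]  # assumed layout: label first
        p = update(weights, hash_features(fields), label, lr)
        p = min(max(p, 1e-15), 1.0 - 1e-15)
        total_loss -= label * math.log(p) + (1.0 - label) * math.log(1.0 - p)
        n += 1
    return weights, (total_loss / n if n else 0.0)
```

Because `train_stream` accepts any iterable of lines, a file object from `open(...)` or `gzip.open(..., "rt")` can be passed directly, and the returned `weights` can be passed back in to continue training on the next file. Only the weight dictionary stays in memory, so the dataset size does not matter on a single machine.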

BelstrelGit commented 4 years ago

Sorry, I use Spark for data engineering jobs, and I don't know how to work with big data without Spark :( I will start working with AI models next month, and it will be simple work. For now I am a newbie on this question.

BelstrelGit commented 4 years ago

If the data is not big, you can simply work with a simple program in Scala or Python :)