asu-cactus / netsdb

A system that seamlessly integrates Big Data processing and machine learning model serving in distributed relational database
Apache License 2.0
15 stars 5 forks

42 support large scale decision forest decision in netsdb #49

Closed jiazou-bigdata closed 2 years ago

jiazou-bigdata commented 2 years ago

@hguan6 Please run the decision forest (random forest and XGBoost) inference on netsDB by following the steps below:

//build the code
ubuntu@ip-172-31-16-75:~/netsdb$ scons libDFTest

//train the models and convert them for netsDB
cd ~/netsdb/model-inference/decisionTree/experiments
run data_processing.py, train_model.py and convert_trained_model_to_framework.py

//clean up the system
ubuntu@ip-172-31-16-75:~/netsdb$ scripts/cleanupNode.sh

//start a pseudo cluster on one node with 8 threads and 30GB memory (I used an R4.2xlarge instance)
ubuntu@ip-172-31-16-75:~/netsdb$ ./scripts/startPseudoCluster.py 8 30000

//load the data from the Higgs test CSV file
ubuntu@ip-172-31-16-75:~/netsdb$ bin/testDecisionForest Y 2200000 28 275000 F 32 model-inference/decisionTree/experiments/HIGGS.csv_test.csv

//run inference
ubuntu@ip-172-31-16-75:~/netsdb$ bin/testDecisionForest N 2200000 28 275000 F 32 model-inference/decisionTree/experiments/HIGGS.csv_test.csv model-inference/decisionTree/experiments/models/higgs_xgboost_500_8_netsdb XGBoost
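The load and inference commands above share most of their arguments, so a small wrapper can keep the two runs from drifting apart. This is only a sketch: the function names and the `BIN`, `ARGS`, `DATA` and `MODEL` variables are made up for illustration, and since this thread does not spell out what each positional argument means, the values are copied verbatim and treated as opaque.

```shell
#!/usr/bin/env bash
# Sketch: wrap the load and inference invocations from the steps above.
# All variable and function names here are hypothetical; only the argument
# values and paths come from the thread. Run it from the ~/netsdb directory.

BIN="${BIN:-bin/testDecisionForest}"
ARGS="2200000 28 275000 F 32"   # shared positional arguments, copied as-is
DATA="model-inference/decisionTree/experiments/HIGGS.csv_test.csv"
MODEL="model-inference/decisionTree/experiments/models/higgs_xgboost_500_8_netsdb"

# Load step: first argument is Y, as in the "load the data" command.
# $ARGS is intentionally unquoted so it splits into separate arguments.
load_data() {
    "$BIN" Y $ARGS "$DATA"
}

# Inference step: first argument is N, followed by the model path and type.
run_inference() {
    "$BIN" N $ARGS "$DATA" "$MODEL" XGBoost
}
```

After sourcing the file, `load_data && run_inference` would issue the same two commands as listed above. Setting `BIN=echo` first prints the arguments instead of running anything, which is a cheap way to check them.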

@venkate5hgunda Please also take a look as you may need to implement LightGBM on netsDB.

hguan6 commented 2 years ago

I ran "scons libDFTest" to build, but got an error message. The last couple of lines are:

Compiling on Linux

Platform: Linux-5.15.0-1015-aws-x86_64-with-glibc2.29

System: Linux

Release: 5.15.0-1015-aws

Version: #19~20.04.1-Ubuntu SMP Wed Jun 22 19:07:51 UTC 2022

scons: done reading SConscript files.

scons: Building targets ...

scons: *** Do not know how to make File target `libDFTest' (/home/ubuntu/netsdb/libDFTest). Stop.

scons: building terminated because of errors.

--

Hong Guan, MS

School of Computing and Augmented Intelligence

Arizona State University

jiazou-bigdata commented 2 years ago

As mentioned in our meeting, you should not run the command on the master branch. Instead, check out my branch from this pull request and then try the steps again.
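For anyone following along, one way to get the pull request's branch locally is GitHub's `pull/<number>/head` ref. The sketch below assumes the remote is named `origin` and uses `pr-49` as the local branch name; neither is stated in this thread, and the `run` helper with its `DRY_RUN` switch is purely illustrative.

```shell
#!/usr/bin/env bash
# Sketch: fetch the branch behind pull request #49 and rebuild the test target.
# Assumptions: remote "origin", local branch name "pr-<number>".

# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Fetch the PR head into a local branch, switch to it, then build libDFTest.
checkout_pr_and_build() {
    local pr="${1:-49}"
    run git fetch origin "pull/${pr}/head:pr-${pr}" &&
    run git checkout "pr-${pr}" &&
    run scons libDFTest
}
```

Calling `DRY_RUN=1 checkout_pr_and_build 49` from `~/netsdb` prints the three commands without touching the repository; dropping `DRY_RUN` runs them.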