uncharted-distil / distil

An analytic workbench for user-guided development of model pipelines
Apache License 2.0

distil

Related Projects

Dependencies

Development

Clone the repository:

mkdir -p $GOPATH/src/github.com/uncharted-distil
cd $GOPATH/src/github.com/uncharted-distil
git clone git@github.com:uncharted-distil/distil.git
cd distil

Install dependencies:

make install

Install datasets:

Datasets are stored using git LFS and can be pulled using the datasets.sh script.

./datasets.sh

To add / remove a dataset modify the $datasets variable:

declare -a datasets=("185_baseball" "LL0_acled" "22_handgeometry")
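
As a rough illustration of how a bash array like this is typically consumed, here is a minimal sketch. The actual contents of datasets.sh are not shown in this README, so the list_datasets helper and the loop body are assumptions for illustration only:

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the real datasets.sh.
declare -a datasets=("185_baseball" "LL0_acled" "22_handgeometry")

# list_datasets is an illustrative helper: it prints one dataset name per line.
list_datasets() {
  local name
  for name in "${datasets[@]}"; do
    printf '%s\n' "$name"
  done
}

list_datasets
```

Adding or removing a quoted name in the array changes what such a loop visits; entries are separated by spaces, not commas.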

Generate code (optional):

If the api/compute/result/complex_field.peg file is changed, regenerate the pandas dataframe parser by running:

make peg

Docker images:

The application requires:

Docker images for each are available at the following registry:

docker.uncharted.software

Log in to the Docker registry:

sudo docker login docker.uncharted.software

Update docker-compose.yml:

---
distil-auto-ml:
  image: docker.uncharted.software/distil-auto-ml

Pull Images:

Pull docker images via Docker Compose:

./update_services.sh

Running the app:

Using three separate terminals:

Terminal 1 - Launch docker containers via Docker Compose:

./run_services.sh

Terminal 2 - Build and watch webapp:

yarn watch

Terminal 3 - Build, watch, and run server:

make watch

Once all three are running, the app will be accessible at localhost:8080.

Advanced Configuration

The location of the dataset directory can be changed by setting the D3MINPUTDIR environment variable, and the location of the temporary data written out during model building can be set using the D3MOUTPUTDIR environment variable. If the docker containers are not running on localhost, their host IP address can be set with DOCKER_HOST (e.g. export DOCKER_HOST=192.168.0.10 && make watch). These variables are used by the other Distil services that are launched via the run_services.sh script, and are typically set as global environment variables in .bashrc or similar.
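
A minimal sketch of setting these variables for a single shell session before launching the server. The directory paths are placeholder assumptions, not project defaults, and should be replaced with the locations used on your machine:

```shell
#!/usr/bin/env bash
# Placeholder paths (assumptions): keep any value already set in the environment.
export D3MINPUTDIR="${D3MINPUTDIR:-$HOME/distil/datasets}"
export D3MOUTPUTDIR="${D3MOUTPUTDIR:-$HOME/distil/outputs}"

# Only needed when the docker containers are not on localhost:
# export DOCKER_HOST=192.168.0.10

# Print the effective values so they can be checked before running `make watch`.
printf 'D3MINPUTDIR=%s\nD3MOUTPUTDIR=%s\n' "$D3MINPUTDIR" "$D3MOUTPUTDIR"
```

The `${VAR:-default}` form leaves a value exported from .bashrc untouched, so the same lines can be kept in a local helper script without overriding a global setting.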

Linter Setup

VSCODE

For the VS Code editor, download and install the ESLint extension. Once it is installed, open the editor settings (hot key ⌘⇧P, then type settings) and add the following to your settings file:

  "eslint.lintTask.enable": true, // enable eslint to run
  "eslint.validate": [
    "vue", // tell eslint to read vue files
    "html", // tell eslint to read html files
    "javascript", // tell eslint to read javascript files
    "typescript" // tell eslint to read typescript files
  ],
  "eslint.workingDirectories": [{ "mode": "auto" }], // eslint will try its best to figure out the working directory of the project

At this point, save your settings file and restart VS Code. If the linter is still not working after the restart, check the output (^⇧`, then the OUTPUT tab, then select ESLint from the dropdown).

Common Issues:

"../repo/subpackage/file.go:10:2: cannot find package "github.com/company/package/subpackage" in any of":

"# pkg-config --cflags -- gdal gdal gdal gdal gdal gdal Package gdal was not found in the pkg-config search path."

Mac

runtime error while training "joblib.externals.loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker."