ndphillips / FFTrees

An R package to create and visualise fast-and-frugal decision trees (FFTs)
https://journal.sjdm.org/17/17217/jdm17217.pdf

Train / Persist / Restore / Predict #204

Closed: baddoxx closed this issue 11 months ago

baddoxx commented 1 year ago

Is there a best-practice way to train a model and serialize it, with a view to restoring it for later predictions?

It seems I would just need the data frame from fft$trees$definitions (or a words representation), but I don't see a way to create a new FFT without specifying the training data. I don't think I need any training or testing data once I have the tree definition.

appreciate your advice - best Richard

pa-nathaniel commented 1 year ago

I believe you want to use the tree.definitions argument to FFTrees().

The last time I used the package, you could supply an existing fft$trees$definitions data frame to FFTrees() via its tree.definitions argument, and it would skip creating new trees and use the supplied ones instead.

I haven't tried in a while though and I know @hneth will have the latest info.
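
For reference, a minimal sketch of that approach, assuming the tree.definitions argument still behaves as described above. It uses the heart.train and heart.test data sets bundled with the package; the file location and the variable names (fit, defs, restored) are illustrative only. As far as I understand, FFTrees() is still given a data set here, but it is used to evaluate the supplied trees rather than to build new ones.

```r
library(FFTrees)

# Train once on the bundled heart-disease training data.
fit <- FFTrees(formula = diagnosis ~ ., data = heart.train)

# Persist only the FFT definitions (a data frame) with base R serialization.
# tempfile() is a placeholder; use a real path in practice.
defs_file <- tempfile(fileext = ".rds")
saveRDS(fit$trees$definitions, defs_file)

# Later (e.g., in a new session): restore the definitions ...
defs <- readRDS(defs_file)

# ... and apply them to other data instead of fitting new trees.
restored <- FFTrees(
  formula = diagnosis ~ .,
  data = heart.test,
  tree.definitions = defs
)
```

Storing only the definitions keeps the saved file small, at the cost of having to pass a formula and a data set again when restoring.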

pa-nathaniel commented 1 year ago

@baddoxx please also check out the following vignette sections to see if they answer your question: https://github.com/ndphillips/FFTrees/blob/master/vignettes/FFTrees_mytree.Rmd#L297-L411

hneth commented 1 year ago

Hi @baddoxx,

Nathaniel's hunch is correct, of course, but here's a more convenient way to access the relevant vignette: "Manually specifying FFTs".

The relevant section is called "2. Using tree.definitions" and describes a workflow to get, change, and use FFT definitions in five steps:

  1. Get (sets of) FFT definitions
  2. Select and convert an individual FFT into a tidy data format
  3. Manipulate the FFT (e.g., by changing its nodes or exits)
  4. Re-convert the changed definition into the original data format
  5. Collect sets of changed FFT definitions and/or evaluate them on data

Conceptually, this workflow boils down to first creating an FFT model for some data, then manipulating the model, and finally re-evaluating it on the data.

When only applying a set of existing FFTs to new data (without changing the FFT-definitions), Steps 2–4 can be skipped.
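
As a rough sketch of the "apply existing FFTs to new data" case, one option is to serialize the whole fitted object with base R's saveRDS() and call predict() on the restored object. This is an assumption about a workable approach, not something taken from the vignette; the file location and the variable names (fit, restored, pred) are illustrative, and heart.train / heart.test are the data sets bundled with the package.

```r
library(FFTrees)

# Train once on the bundled heart-disease training data.
fit <- FFTrees(formula = diagnosis ~ ., data = heart.train)

# Persist the complete FFTrees object (definitions plus settings).
# tempfile() is a placeholder; use a real path in practice.
model_file <- tempfile(fileext = ".rds")
saveRDS(fit, model_file)

# Later (e.g., in a new session): restore the object ...
restored <- readRDS(model_file)

# ... and classify new cases with the first tree of the restored model.
pred <- predict(restored, newdata = heart.test, tree = 1)
```

The result should contain one prediction per row of the new data. The saved file also carries the training data stored inside the object, so it is larger than saving the definitions alone.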

Hope this helps, Hans

baddoxx commented 11 months ago

thanks @pa-nathaniel @hneth - that worked well.

best Richard