mlr-org / mlr3fda

Functional Data Analysis for mlr3

https://mlr3fda.mlr-org.com/

GNU Lesser General Public License v3.0

Size of task prototype #52

Open sebffischer opened 1 year ago

sebffischer commented 1 year ago

When a graph learner is trained on a task with tf columns, the data_prototype saved in the learner's state after training contains the arg and value vectors as well as the evaluator and other metadata defined in tf. This unnecessarily and unintentionally inflates the size of learner states.

I think this should be fixed in tf, i.e. 0-length tf vectors should drop discardable metadata.

sebffischer commented 11 months ago

With recent PRs in mlr3 and mlr3misc, this problem should be mostly mitigated:

sebffischer commented 9 months ago

Since we have decided not to merge the warning, we should do something about this. E.g. when mlr3 creates the data prototype during $train(), it should be possible to register a function that leanifies each column, e.g. stored in mlr_reflections$data_leanifier$tf. This function would then remove the srcref attribute from the functional columns to avoid overly large objects when packages are installed with source references. In the resample() case this is not a problem because the prototype is not kept in the learner state.
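To illustrate what such a leanifier would do, here is a minimal base R sketch. It assumes the culprit is a function (such as an evaluator) that carries a srcref attribute. The names evaluator and leanify_function are hypothetical and not part of mlr3 or tf.

```r
# Parse with keep.source = TRUE to mimic a package installed with source references.
evaluator <- eval(parse(
  text = "function(x, arg) {\n  # interpolation would happen here\n  x\n}",
  keep.source = TRUE
))

# The closure now carries a srcref attribute pointing at the original source text.
has_srcref_before <- !is.null(attr(evaluator, "srcref"))

# Hypothetical leanifier: drop the srcref attribute and leave the function itself alone.
leanify_function <- function(f) {
  attr(f, "srcref") <- NULL
  f
}

lean_evaluator <- leanify_function(evaluator)
has_srcref_after <- !is.null(attr(lean_evaluator, "srcref"))

# Behaviour is unchanged; only the source reference is gone.
same_result <- identical(evaluator(1, 2), lean_evaluator(1, 2))
```

Base R also provides utils::removeSource(), which strips source references more thoroughly (including those attached to the function body); whether that or a lighter attribute removal fits better here is a design choice.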

sebffischer commented 9 months ago

[ ] Merge generic strip_srcref to mlr3misc: https://github.com/mlr-org/mlr3misc/pull/105/files and release
[ ] Merge https://github.com/mlr-org/mlr3/pull/1002 and release
[ ] Implement strip_srcref.tfd_reg and strip_srcref.tfd_irreg methods in mlr3fda.
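The last item could look roughly like the following S3 sketch. This is not the actual mlr3misc or mlr3fda code: the generic is redefined locally for illustration, and the assumption that the evaluator function lives in an attribute named "evaluator" is a guess about tf internals that should be checked against the package.

```r
# Local stand-in for the generic proposed for mlr3misc (the real signature may differ).
strip_srcref <- function(x, ...) UseMethod("strip_srcref")

# Objects without a dedicated method are returned unchanged.
strip_srcref.default <- function(x, ...) x

# Functions lose their srcref attribute.
strip_srcref.function <- function(x, ...) {
  attr(x, "srcref") <- NULL
  x
}

# Assumption: the functional vector stores its evaluator in the "evaluator" attribute.
strip_srcref.tfd_reg <- function(x, ...) {
  ev <- attr(x, "evaluator")
  if (is.function(ev)) {
    attr(x, "evaluator") <- strip_srcref(ev)
  }
  x
}

# Irregular functional vectors are treated the same way in this sketch.
strip_srcref.tfd_irreg <- strip_srcref.tfd_reg

# Toy object that only mimics the class and attribute layout, not a real tf vector.
toy <- structure(
  list(),
  class = c("tfd_reg", "tfd", "tf"),
  evaluator = eval(parse(text = "function(x) x", keep.source = TRUE))
)

lean <- strip_srcref(toy)
```

Keeping a default method that returns its input unchanged means the generic can be applied to every column of the prototype without checking types first.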