mlr-org / paradox

ParamHelpers Next Generation
https://paradox.mlr-org.com
GNU Lesser General Public License v3.0

Hyphens in parameter ids #295

Closed damirpolat closed 4 years ago

damirpolat commented 4 years ago

I'm integrating the XMeans learner from the RWeka package. However, I cannot create paradox parameters for some of its options because their names contain hyphens.

Parameter names (see use-kdtree)

> library(RWeka)
> WOW(XMeans)

-I <num>
        maximum number of overall iterations (default 1).
    Number of arguments: 1.
-M <num>
        maximum number of iterations in the kMeans loop in the Improve-Parameter part (default 1000).
    Number of arguments: 1.
-J <num>
        maximum number of iterations in the kMeans loop for the splitted centroids in the Improve-Structure part
        (default 1000).
    Number of arguments: 1.
-L <num>
        minimum number of clusters (default 2).
    Number of arguments: 1.
-H <num>
        maximum number of clusters (default 4).
    Number of arguments: 1.
-B <value>
        distance value for binary attributes (default 1.0).
    Number of arguments: 1.
-use-kdtree
        Uses the KDTree internally (default no).
-K <KDTree class specification>
        Full class name of KDTree class to use, followed by scheme options.  eg:
        "weka.core.neighboursearch.kdtrees.KDTree -P" (default no KDTree class used).
    Number of arguments: 1.
-C <value>
        cutoff factor, takes the given percentage of the splitted centroids if none of the children win (default 0.0).
    Number of arguments: 1.
-D <distance function class specification>
        Full class name of Distance function class to use, followed by scheme options.  (default
        weka.core.EuclideanDistance).
    Number of arguments: 1.
-N <file name>
        file to read starting centers from (ARFF format).
    Number of arguments: 1.
-O <file name>
        file to write centers to (ARFF format).
    Number of arguments: 1.
-U <int>
        The debug level.  (default 0)
    Number of arguments: 1.
-Y <file name>
        The debug vectors file.
    Number of arguments: 1.
-S <num>
        Random number seed.  (default 10)
    Number of arguments: 1.
-output-debug-info
        If set, clusterer is run in debug mode and may output additional info to the console
-do-not-check-capabilities
        If set, clusterer capabilities are not checked before clusterer is built (use with caution).

Here's what I'm trying to do:

> library(paradox)
> ps = ParamSet$new(
    params = list(
      ParamLgl$new(id = "use-kdtree", default = FALSE, tags = "train")
    )
  )

Error in assert_id(id) : Assertion on 'id' failed: Must comply to pattern '^[[:alpha:]]+[[:alnum:]_.]*$'.
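
The pattern in the assertion message can be checked directly in base R, without paradox. The sketch below only copies the regular expression from the error above; the example ids other than "use-kdtree" are illustrative assumptions.

```r
# Pattern copied verbatim from the assertion message.
id_pattern <- "^[[:alpha:]]+[[:alnum:]_.]*$"

# Candidate ids; everything except "use-kdtree" is a made-up example.
ids <- c("use-kdtree", "use_kdtree", "use.kdtree", "K")

# TRUE where the id matches the pattern, FALSE otherwise.
valid <- grepl(id_pattern, ids)
names(valid) <- ids
print(valid)
```

Only the hyphenated id fails the match, because "-" is not in the allowed set of letters, digits, "_" and ".".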

Should I work around this within the learner script, or will paradox support hyphens in parameter ids in the future?

Thanks, Damir

jakob-r commented 4 years ago

We could allow "-" characters, but that would mean having data.tables whose column names contain "-", which can be problematic. I would therefore suggest avoiding special characters and instead:

Simply name such parameters with an underscore (e.g. use_kdtree) and resolve the name difference in the train() method.
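
A minimal sketch of that renaming step, written in base R. The helper name to_weka_names and the renames lookup are hypothetical and not part of paradox, mlr3 or RWeka; they only show one way the learner's train() method could translate ids before handing the values to Weka.

```r
# Hypothetical helper: map R-safe parameter ids back to the Weka option names.
# `renames` is a named character vector: names are paradox ids, values are Weka names.
to_weka_names <- function(pars, renames = c(use_kdtree = "use-kdtree")) {
  hit <- names(pars) %in% names(renames)
  names(pars)[hit] <- unname(renames[names(pars)[hit]])
  pars
}

# Parameter values as they might arrive from the parameter set (illustrative values).
pars <- list(I = 5, use_kdtree = TRUE)

weka_pars <- to_weka_names(pars)
print(names(weka_pars))
```

Inside train(), the renamed list could then be forwarded to Weka, for example with do.call(RWeka::Weka_control, weka_pars); that forwarding step is an assumption about the learner and is not run here.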

jakob-r commented 4 years ago

I will close this issue because we won't allow "-" in parameter ids.
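
The column-name concern raised earlier in the thread can be illustrated with a short sketch. It uses a base R data.frame rather than a data.table, which is an assumption made to keep the example free of dependencies; the underlying issue, that "use-kdtree" is not a syntactic R name, is the same.

```r
# With name checking disabled, the hyphenated column name is kept as-is.
df_raw <- data.frame(`use-kdtree` = TRUE, check.names = FALSE)
print(names(df_raw))

# With the default check.names = TRUE, R rewrites it into a syntactic name.
df_checked <- data.frame(`use-kdtree` = TRUE)
print(names(df_checked))

# Written without backticks, the name parses as a subtraction of two symbols.
expr <- quote(use-kdtree)
print(as.character(expr[[1]]))
print(all.vars(expr))
```

Any code that refers to such a column by its bare name would therefore need backticks everywhere, or would silently be read as `use` minus `kdtree`.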