ContinuumIO / xarray_filters

A Pipeline approach to chaining common xarray data structure conversions

Default arguments for xarray_filters.datasets.make_* functions #17

Open PeterDSteinberg opened 7 years ago

PeterDSteinberg commented 7 years ago

@gpfreitas This is related to issues #5 and #6 and condenses them into a single TODO list.

Items to do related to the argument specs of make_* functions from xarray_filters.datasets:

In [3]: ?make_blobs
Signature: make_blobs(n_samples=100, n_features=2, centers=3, cluster_std=1.0, center_box=(-10.0, 10.0), shuffle=True, random_state=None, *, astype='dataset', **kwargs)
Docstring:
Like sklearn.datasets.samples_generator.make_blobs, but with added functionality.

Parameters
---------------------
Same parameters/arguments as sklearn.datasets.samples_generator.make_blobs, in addition to the following
keyword-only arguments:

astype: str
    One of ('array', 'dataframe', 'dataset', 'mldataset') or None to return an NpXyTransformer. See documentation
    of NpXyTransformer.astype.

**kwargs: dict
    Optional arguments that depend on astype. See documentation of
    NpXyTransformer.astype.

Note: where I mentioned dask_glm above, also look at dask-ml.

PeterDSteinberg commented 7 years ago

Other TODOs I need to add:

gpfreitas commented 7 years ago

I think letting shape be a dict should be enough for letting the user customize dimension names.
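A rough sketch of what that could look like, assuming the dict maps dimension names to sizes and that a plain tuple falls back to generated names. The helper name `split_shape` and the `dim_` prefix are hypothetical, not part of xarray_filters:

```python
from collections.abc import Mapping


def split_shape(shape, default_prefix="dim_"):
    """Return (dims, sizes) from a tuple of sizes or a name -> size mapping.

    Hypothetical helper for illustration only; not part of xarray_filters.
    """
    if isinstance(shape, Mapping):
        # Mapping keys become the dimension names, in insertion order.
        # On Python versions before 3.7, pass an OrderedDict to keep the order.
        dims = tuple(shape.keys())
        sizes = tuple(shape.values())
    else:
        # A plain sequence of sizes gets generated names: dim_0, dim_1, ...
        sizes = tuple(shape)
        dims = tuple("{}{}".format(default_prefix, i) for i in range(len(sizes)))
    return dims, sizes


named = split_shape({"time": 4, "lat": 3})
unnamed = split_shape((4, 3))
```

With this, `shape={"time": 4, "lat": 3}` carries both the sizes and the names, so no separate `dims` argument would be needed.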

So, what's left is the harder part:

For astype, @PeterDSteinberg, we should leave the to_* methods intact, right? So, passing astype='numpy.ndarray' would call XyTransformer.to_array. Sounds good?
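To make the proposal concrete, here is a minimal sketch of `astype` as a thin dispatcher over untouched `to_*` methods. The class `XyTransformerSketch`, the alias table, and the return values are stand-ins invented for illustration and are not the actual xarray_filters implementation:

```python
class XyTransformerSketch:
    """Hypothetical stand-in for the transformer class discussed above."""

    # Assumed alias table: several spellings may map to the same to_* method.
    _ASTYPE_METHODS = {
        "array": "to_array",
        "numpy.ndarray": "to_array",
        "dataframe": "to_dataframe",
    }

    def __init__(self, X, y=None):
        self.X = X
        self.y = y

    def to_array(self):
        # Placeholder conversion: the real method would return NumPy arrays.
        return {"kind": "array", "X": self.X, "y": self.y}

    def to_dataframe(self, columns=None):
        # Placeholder conversion: the real method would build a DataFrame.
        return {"kind": "dataframe", "X": self.X, "y": self.y, "columns": columns}

    def astype(self, to_type=None, **kwargs):
        """Dispatch to the matching to_* method; None returns self unchanged."""
        if to_type is None:
            return self
        try:
            method_name = self._ASTYPE_METHODS[to_type]
        except KeyError:
            raise ValueError("Unknown astype: {!r}".format(to_type))
        # Extra keyword arguments are forwarded to the chosen to_* method.
        return getattr(self, method_name)(**kwargs)


t = XyTransformerSketch([[1.0, 2.0]], [0])
as_array = t.astype("numpy.ndarray")
as_frame = t.astype("dataframe", columns=["a", "b"])
```

The `to_*` methods stay callable on their own, and `astype` only adds the string lookup on top of them.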

gpfreitas commented 7 years ago

Working on the dask-glm support.

PeterDSteinberg commented 7 years ago

Note the dask-ml / dask-glm related work is being addressed in a separate issue: https://github.com/ContinuumIO/xarray_filters/issues/36