yt-project / yt

Main yt repository
http://yt-project.org

Automatically create data sources as necessary from arrays, dicts, dataframes #1499

Open matthewturk opened 7 years ago

matthewturk commented 7 years ago

A common operation that people want to do is something like "Give me a phase plot of this thing!" They may not have gone through the entire process of building a dataset with load_uniform_grid and then selecting all_data(), but the request is just as valid.

One straightforward way of addressing this would be to build out a mechanism for automatically converting guessable formats into datasets. This would work nicely for things like phase plots and profile plots, but if the appropriate arrays and records are available, it could be just as useful to do this for projections and slices as well.

As one example:

For projection and slice plots, it could look for records such as x, y, and z, and then create the dataset based on their extent.
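A minimal sketch of what "creating the dataset based on the extent" might involve, assuming the coordinate records hold the cell centers of a uniformly spaced grid. The helper name `guess_grid` and that layout assumption are hypothetical and not part of yt; the result is only the kind of `domain_dimensions` and `bbox` pair that a grid loader would need.

```python
import numpy as np


def guess_grid(records, axes=("x", "y", "z")):
    """Guess grid dimensions and bounding box from coordinate records.

    Hypothetical helper: assumes each coordinate record holds cell centers
    of a uniformly spaced grid (an assumption, not something yt defines).
    """
    dims = []
    bbox = []
    for ax in axes:
        # Sorted unique cell-center positions along this axis.
        centers = np.unique(np.asarray(records[ax], dtype="float64"))
        # Half a cell width; fall back to 0.5 for a single-cell axis.
        half = 0.5 * (centers[1] - centers[0]) if centers.size > 1 else 0.5
        dims.append(int(centers.size))
        # Pad by half a cell so the box covers cell edges, not just centers.
        bbox.append([centers[0] - half, centers[-1] + half])
    return tuple(dims), np.array(bbox)
```

Padding by half a cell is a design choice: taking the raw min and max of the centers would make the domain one cell narrower than the data it holds.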

The endpoint might look something like:

my_data = dict(x=..., y=..., z=..., density=..., temperature=...)
yt.PhasePlot(my_data, "density", "temperature")

This would substantially lower the barrier to entry for using yt functionality, and would not be limited to plots; this could also work for other top-level operations.
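For comparison, here is a rough sketch of the boilerplate the proposal would hide, written against the existing `yt.load_uniform_grid`, `all_data()`, and `yt.PhasePlot` calls. The wrapper names `common_shape` and `phase_plot_from_arrays` are hypothetical, and the sketch assumes every field is a 3D array of the same shape, that fields loaded this way live under the `"stream"` field type, and that an unweighted plot (`weight_field=None`) is acceptable.

```python
import numpy as np


def common_shape(fields):
    """Return the single 3D shape shared by all arrays, or raise ValueError."""
    shapes = {np.shape(values) for values in fields.values()}
    if len(shapes) != 1:
        raise ValueError("all fields must share exactly one shape")
    (shape,) = shapes
    if len(shape) != 3:
        raise ValueError("fields must be 3D arrays")
    return shape


def phase_plot_from_arrays(fields, x_field, y_field, z_field):
    """Hypothetical wrapper around the steps a user performs by hand today."""
    import yt  # imported lazily so the shape check above works without yt

    arrays = {name: np.asarray(v, dtype="float64") for name, v in fields.items()}
    # Step 1: wrap the in-memory arrays in a dataset.
    ds = yt.load_uniform_grid(arrays, common_shape(arrays))
    # Step 2: select the whole domain as the data source.
    ad = ds.all_data()
    # Step 3: build the plot (assumes the "stream" field type, unweighted).
    return yt.PhasePlot(
        ad,
        ("stream", x_field),
        ("stream", y_field),
        [("stream", z_field)],
        weight_field=None,
    )
```

Note that the existing `PhasePlot` also expects a field to bin, so a two-field call like the one above would need some default for it.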

ngoldbaum commented 7 years ago

This would require adding more input validation in the public API (e.g. the example in the proposal would require code in PhasePlot to convert arbitrary python objects into datasets). I worry about missing places in the API that currently expect a dataset or data object but with this proposal would accept many other types of objects. It's important to keep the API consistent. I also worry about our API seeming too "magical".

What about making yt.load() return a dataset if you pass it one of these formats? I think that would require fewer changes to top-level yt API classes and functions, since they would still only accept datasets.
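One way this could look, as a sketch only: normalize the "guessable" inputs to a plain dict of arrays in a single place, then hand that dict to the existing `yt.load_uniform_grid`. The names `to_field_dict` and `load_guessable` are hypothetical and not part of yt, and the DataFrame branch is duck-typed on a `columns` attribute rather than importing pandas.

```python
import os

import numpy as np


def to_field_dict(obj):
    """Normalize a dict, structured array, or DataFrame-like to {name: array}."""
    if isinstance(obj, dict):
        return {name: np.asarray(values) for name, values in obj.items()}
    # NumPy structured (record) arrays expose their field names on the dtype.
    if isinstance(obj, np.ndarray) and obj.dtype.names is not None:
        return {name: obj[name] for name in obj.dtype.names}
    # Duck-typed DataFrame: anything with .columns and item access.
    if hasattr(obj, "columns") and hasattr(obj, "__getitem__"):
        return {str(name): np.asarray(obj[name]) for name in obj.columns}
    raise TypeError(f"cannot guess fields from {type(obj).__name__}")


def load_guessable(obj, **kwargs):
    """Hypothetical load() front end: paths go to yt.load, arrays to a grid."""
    import yt  # imported lazily so to_field_dict can be used without yt

    if isinstance(obj, (str, os.PathLike)):
        return yt.load(obj, **kwargs)
    fields = to_field_dict(obj)
    # Assumes every field is a 3D array of the same shape.
    shape = next(iter(fields.values())).shape
    return yt.load_uniform_grid(fields, shape, **kwargs)
```

With this shape, plotting classes keep accepting only datasets and data objects, and the type guessing stays in one function.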

matthewturk commented 7 years ago

I'd contend that this would be a nice way of reducing boilerplate code without making the API considerably more magical, provided we restricted it to a handful of places (such as just the visualization commands). I do think that having yt.load() do this would also be useful.