Open mahiki opened 2 months ago
A key part of this is that `read(::Dataset)` is currently defined as a `read_path` function attached to a Prefect block, which has a very object-oriented structure.
The way I'm using Dataset, it's just a metadata reference, mostly carrying filepath locations and local/remote labels.
I do not want to define a 'dataset' with a block; the only Prefect block reference needed is the base path to the data store.
Again, this was borrowed from the way Prefect file blocks include a `read_path`/`write_path` object method, which creates too much linkage between Prefect's internal object-oriented structure and the structure of my data application.
I guess this is called a 'leaky abstraction'.
When you are working off of local datastores only, it's a bit clunky to have to define the connection to the Prefect API and define the names of remote and local Prefect blocks.
Until you can define flow code in Julia scripts, there's no upside to the Prefect integration, since you are writing your flow code in Python and calling a Julia process.
Example local Julia exploratory use case:
If a 'Dataset' module (name already taken) could stand alone from PrefectInterfaces, you could bring it in as an extension when needed. In standalone mode, you'll need to define the filesystem block yourself instead of calling the API URL to get it:
And that's all you need to find datasets in your local system. You are working in Julia outside of any Prefect orchestration.
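To make the proposal concrete, here is a minimal sketch of what that decoupling might look like. All names in it (`LocalStore`, `DatasetRef`, `fullpath`, `readdata`, `writedata`) are hypothetical and not taken from PrefectInterfaces; the assumption is only that the store carries a base path and the dataset carries metadata, with reading done by a plain function rather than a method attached to a block.

```julia
# Hypothetical sketch: none of these names come from PrefectInterfaces.

# Stand-in for a local filesystem block: it only carries the base path.
struct LocalStore
    basepath::String
end

# Pure metadata reference: where the data lives relative to a store.
struct DatasetRef
    name::String
    relpath::String   # path relative to the store's base path
    label::Symbol     # :local or :remote
end

# Resolve the full path from the store and the dataset metadata.
fullpath(store::LocalStore, ds::DatasetRef) = joinpath(store.basepath, ds.relpath)

# Reading is a plain function dispatched on both arguments,
# not a method owned by the block object.
readdata(store::LocalStore, ds::DatasetRef) = read(fullpath(store, ds), String)

# Writing creates the parent directory if needed and returns the path written.
function writedata(store::LocalStore, ds::DatasetRef, content::AbstractString)
    path = fullpath(store, ds)
    mkpath(dirname(path))
    write(path, content)
    return path
end

# Usage: no Prefect API connection is needed for local exploration.
store = LocalStore(mktempdir())
ds = DatasetRef("example", joinpath("extracts", "example.csv"), :local)
writedata(store, ds, "a,b\n1,2\n")
println(readdata(store, ds))
```

With this shape, a Prefect-backed store could presumably be added later as another type with its own `readdata` method, so the dataset metadata itself never needs to know about blocks.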