alibaba / pipcook

Machine learning platform for Web developers
https://alibaba.github.io/pipcook/
Apache License 2.0
2.55k stars, 209 forks

core: rename api for better semantic #781

Closed FeelyChau closed 3 years ago

FeelyChau commented 3 years ago

Pipeline:

Runtime:

yorkie commented 3 years ago

And I'm -1 on renaming datasource to dataset; the data source actually represents a data schema for multiple datasets.

FeelyChau commented 3 years ago

> And I'm -1 on renaming datasource to dataset; the data source actually represents a data schema for multiple datasets.

If you mean the datasource type will be changed, I think we should confirm it before the 2.0 release. Otherwise, if the interface type is exactly equal to Dataset, I think the name dataset is better; the model script should not care about the data flow.

yorkie commented 3 years ago

Dataset is a datacook concept; the data source is Pipcook's.

FeelyChau commented 3 years ago

> Dataset is a datacook concept; the data source is Pipcook's.

What's the difference apart from the name? In my opinion, they are the same.

yorkie commented 3 years ago

Good question @FeelyChau. I searched around a bit for it:

In summary, they are not the same thing: a Data Source is a place from which we fetch data or a dataset, while a Dataset is a set of data and represents the data itself. Therefore I agree with using "dataset" :)
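
To make the distinction concrete, here is a rough TypeScript sketch. All names in it (`Dataset`, `DataSource`, `InMemorySource`, `countSamples`) are hypothetical and only illustrate the idea; they are not the actual Pipcook or datacook interfaces.

```typescript
// Hypothetical types for illustration only; not Pipcook's or datacook's real API.

// A dataset is the data itself: a finite collection of samples.
interface Dataset<T> {
  samples: T[];
}

// A data source is a place that datasets are fetched from.
// One source can serve several datasets (e.g. train and test splits).
interface DataSource<T> {
  fetch(name: string): Dataset<T>;
}

// A simple in-memory data source backed by a name -> samples record.
class InMemorySource<T> implements DataSource<T> {
  private readonly store: Record<string, T[]>;

  constructor(store: Record<string, T[]>) {
    this.store = store;
  }

  fetch(name: string): Dataset<T> {
    if (!Object.prototype.hasOwnProperty.call(this.store, name)) {
      throw new Error("unknown dataset: " + name);
    }
    // Copy so callers cannot mutate the source's backing storage.
    return { samples: this.store[name].slice() };
  }
}

// A model script only needs the dataset, not where it came from.
function countSamples<T>(dataset: Dataset<T>): number {
  return dataset.samples.length;
}

const source = new InMemorySource<number>({
  train: [1, 2, 3],
  validation: [4],
});
const trainSet = source.fetch("train");
const validationSet = source.fetch("validation");
```

Under this reading, the model script receives `trainSet` (the data) and never touches `source` (where it came from), which is why "dataset" seems the better name for what the script sees.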