iterative / studio-support

❓ DVC Studio Issues, Question, and Discussions
https://studio.iterative.ai

Table data columns: show more information about imported data (`rev`, `source`, etc) #16

Open daavoo opened 3 years ago

daavoo commented 3 years ago

When a data file has been imported or updated using the --rev option of dvc import / dvc update, a rev subfield is added to the .dvc file (see Example: Importing and updating fixed revisions).

I think that, where it exists, showing the value of rev (or allowing it to be shown) could be more useful than size, which is what the "column view" currently displays.

shcheklein commented 3 years ago

Thanks @daavoo ! It makes sense. Columns already have different possible representations:

[Screenshot omitted: Screen Shot 2021-06-21 at 11 46 13 AM]

Also tooltips:

[Screenshot omitted: Screen Shot 2021-06-21 at 11 47 34 AM]

Neither of those covers revisions or imported datasets. I think it makes sense to mention the source as well, not only the revision.

I think we can display it as part of the tooltip. And/or make another "file parameter" in the column dropdown. WDYT?

tapadipti commented 3 years ago

> I think we can display it as part of the tooltip. And/or make another "file parameter" in the column dropdown. WDYT?

@daavoo any thoughts on this?

daavoo commented 3 years ago

> > I think we can display it as part of the tooltip. And/or make another "file parameter" in the column dropdown. WDYT?
>
> @daavoo any thoughts on this?

Tooltip and the new "file parameter" would be nice.

I would also consider making rev the default field to display (if it exists), as its presence probably means it carries meaningful information for the user (otherwise they wouldn't be using it).
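As a rough sketch of that fallback (the `file_info` dict and the `default_display_field` helper are hypothetical names used for illustration, not Studio's actual code), the default could be chosen like this:

```python
def default_display_field(file_info: dict) -> str:
    """Pick which field to show by default for a file column.

    Assumes `file_info` is a hypothetical dict that may carry a `rev`
    key for imported files, alongside the existing `size` field.
    """
    # Prefer `rev` when it is present and non-empty; otherwise keep the
    # current behaviour of showing `size`.
    if file_info.get("rev"):
        return "rev"
    return "size"


imported = {"size": 41149064, "nfiles": 1800, "rev": "cats-dogs-v1"}
regular = {"size": 41149064, "nfiles": 1800}

print(default_display_field(imported))
print(default_display_field(regular))
```

Files without a rev keep showing size, so nothing changes for columns that are not fixed-revision imports.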

mvshmakov commented 3 years ago

@Suor do we already parse the rev and source fields and are we able to serve them to FE?

Suor commented 3 years ago

No, we don't. Right now we have the same fields for all files: hash, size, nfiles, flags. Imports are not distinguished.

erudin commented 2 years ago

@daavoo what are you referring to with source info? Is this the same as the remote?

daavoo commented 2 years ago

> @daavoo what are you referring to with source info? Is this the same as the remote?

I didn't use the term source, but I think it refers to the url field that gets stored in the .dvc file when using dvc import (@shcheklein can confirm whether this is what he was referring to):

> The url argument specifies the address of the DVC or Git repository containing the data source.

This field gets created for imported .dvc files, regardless of whether the --rev option was used.

So, running:

dvc import git@github.com:iterative/dataset-registry.git use-cases/cats-dogs --rev 'cats-dogs-v1'

creates a cats-dogs.dvc with the following content:

md5: 7ff366c716a376ec009054f1c141dc17
frozen: true
deps:
- path: use-cases/cats-dogs
  repo:
    url: git@github.com:iterative/dataset-registry.git
    rev: cats-dogs-v1
    rev_lock: 0547f5883fb18e523e35578e2f0d19648c8f2d5c
outs:
- md5: b6923e1e4ad16ea1a7e2b328842d56a2.dir
  size: 41149064
  nfiles: 1800
  path: cats-dogs

ref: https://dvc.org/doc/command-reference/import#example-importing-and-updating-fixed-revisions
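To make the structure above concrete, here is a minimal Python sketch of pulling the import information out of an already-parsed .dvc file. The `import_info` helper is a hypothetical name, and the file content is written out as a plain dict (mirroring the YAML above) so the sketch needs no YAML library:

```python
from typing import Optional

# Parsed content of the cats-dogs.dvc file shown above.
DVC_FILE = {
    "md5": "7ff366c716a376ec009054f1c141dc17",
    "frozen": True,
    "deps": [
        {
            "path": "use-cases/cats-dogs",
            "repo": {
                "url": "git@github.com:iterative/dataset-registry.git",
                "rev": "cats-dogs-v1",
                "rev_lock": "0547f5883fb18e523e35578e2f0d19648c8f2d5c",
            },
        }
    ],
    "outs": [
        {
            "md5": "b6923e1e4ad16ea1a7e2b328842d56a2.dir",
            "size": 41149064,
            "nfiles": 1800,
            "path": "cats-dogs",
        }
    ],
}


def import_info(dvc_file: dict) -> Optional[dict]:
    """Return a copy of the `repo` mapping of the first dependency that
    has one, or None when the file is not an import."""
    for dep in dvc_file.get("deps") or []:
        repo = dep.get("repo")
        if repo:
            return dict(repo)
    return None


info = import_info(DVC_FILE)
print(info["url"])
# `rev` is only present when --rev was used, so read it with .get().
print(info.get("rev"))
```

A .dvc file with no deps section (a file tracked with dvc add, for example) yields None here, which is one way imports could be told apart from other files.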

erudin commented 2 years ago

Thanks for the explanation and clarification. I'm trying to understand which information should be shown besides rev and url (assuming url is what source refers to).

Suor commented 2 years ago

@erudin let's save everything under that repo key. Let FE decide what to show.
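A minimal sketch of that split, assuming a hypothetical `build_file_record` helper on the backend (the record field names are illustrative, not Studio's actual schema): the repo mapping is stored verbatim next to the existing fields, and the frontend picks what to render.

```python
import json
from typing import Optional


def build_file_record(out: dict, repo: Optional[dict] = None) -> dict:
    """Build the per-file record served to the frontend.

    `out` is one entry of the .dvc file's `outs` list; `repo` is the
    `repo` mapping of the matching dependency, if the file is an import.
    """
    record = {
        "hash": out.get("md5"),
        "size": out.get("size"),
        "nfiles": out.get("nfiles"),
    }
    if repo is not None:
        # Pass the whole mapping through untouched (url, rev, rev_lock, ...)
        # instead of selecting keys here.
        record["repo"] = dict(repo)
    return record


out = {
    "md5": "b6923e1e4ad16ea1a7e2b328842d56a2.dir",
    "size": 41149064,
    "nfiles": 1800,
    "path": "cats-dogs",
}
repo = {
    "url": "git@github.com:iterative/dataset-registry.git",
    "rev": "cats-dogs-v1",
    "rev_lock": "0547f5883fb18e523e35578e2f0d19648c8f2d5c",
}

print(json.dumps(build_file_record(out, repo), indent=2))
```

Keeping the mapping intact means a later decision to show url or rev_lock would not need another backend change.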

tapadipti commented 2 years ago

@daavoo For a given imported file, the url will remain the same. So, it should be enough to display only rev, right? Or should we display url also?

Suor commented 2 years ago

If we don't show url, where will one see it at all? It's true that it will usually be the same for the whole column, but it is still possible for an import to be removed and re-added at the same place with a different url.

daavoo commented 2 years ago

> @daavoo For a given imported file, the url will remain the same. So, it should be enough to display only rev, right? Or should we display url also?

Agree with @Suor's comment above. Even if the value doesn't change across rows, the url is valuable information that should be visible.