kubeflow / pipelines

Machine Learning Pipelines for Kubeflow
https://www.kubeflow.org/docs/components/pipelines/
Apache License 2.0

Empty files retrieved from minio through output visualizer. #1078

Closed illeatmyhat closed 4 years ago

illeatmyhat commented 5 years ago

I'm testing the output visualizer, and the pipeline/artifacts/get endpoint returns nothing but empty files for valid keys (other cases return proper errors). For example, if I execute a pipeline defined like so:

/mnt/workflow/viewer_output.json:
{
  "version": 1,
  "outputs": [
    {
      "type": "table",
      "format": "csv",
      "source": "minio://visualizer-output/some_file.csv",
      "header": ["", "foo", "bar", "baz"]
    }
  ]
}

import kfp.dsl as dsl
from kubernetes.client import (V1PersistentVolumeClaimVolumeSource, V1Volume,
                               V1VolumeMount)

# Copy the viewer metadata into place, then upload the CSV to a new MinIO bucket.
dsl.ContainerOp(
  name='visualizer-test',
  image='minio/mc',
  command=['sh', '-c', 'cp /mnt/workflow/viewer_output.json /mlpipeline-ui-metadata.json ; '
                       'mc config host add kubeflow http://minio-service.kubeflow:9000 minio minio123 ; '
                       'mc mb kubeflow/visualizer-output ; '
                       'mc cp /mnt/workflow/some_file.csv kubeflow/visualizer-output/some_file.csv']
).add_volume(V1Volume(name='volume-1',
                      persistent_volume_claim=V1PersistentVolumeClaimVolumeSource(claim_name='nfs'))) \
 .add_volume_mount(V1VolumeMount(mount_path='/mnt/workflow/', name='volume-1'))

And try to retrieve the file...

curl "http://localhost:8080/pipeline/artifacts/get?source=minio&bucket=visualizer-output&key=some_file.csv"
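For reference, the query string in that request lines up with the `source` field of the viewer metadata. A minimal sketch of that mapping, assuming it is simply scheme, bucket, and key as the example above suggests; `artifact_url` and its `base` default are hypothetical names for illustration, not part of Kubeflow:

```python
from urllib.parse import urlencode, urlparse


def artifact_url(source: str, base: str = "http://localhost:8080") -> str:
    """Build a pipeline/artifacts/get URL from a minio://<bucket>/<key> URI."""
    parsed = urlparse(source)
    if parsed.scheme != "minio" or not parsed.netloc:
        raise ValueError(f"expected minio://<bucket>/<key>, got {source!r}")
    query = urlencode({
        "source": parsed.scheme,          # storage backend, here "minio"
        "bucket": parsed.netloc,          # bucket name
        "key": parsed.path.lstrip("/"),   # object key inside the bucket
    })
    return f"{base}/pipeline/artifacts/get?{query}"


print(artifact_url("minio://visualizer-output/some_file.csv"))
```

Note that `urlencode` percent-encodes slashes in nested keys; whether the endpoint expects that is not something this thread confirms.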

I get nothing back, and no errors. This results in empty visualizer output, essentially the same as #489.

However, accessing the minio UI directly and downloading the file works fine.

illeatmyhat commented 5 years ago

I found the problem. pipeline/artifacts/get only works with tar files. Note that archives created with the stock tar -cvzf on OSX seem to retain some extra header data, so use GNU tar instead: gtar -cvzf

curl "http://localhost:8080/pipeline/artifacts/get?source=minio&bucket=visualizer-output&key=some_file.tgz"
{"fgs": "fds"}
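If GNU tar is not available, the archive can also be built from Python, which sidesteps the platform tar altogether. A minimal sketch using the standard `tarfile` module; `pack_tgz` is a hypothetical helper name, and it assumes a single-file gzipped archive is all the endpoint needs:

```python
import io
import tarfile


def pack_tgz(name: str, data: bytes) -> bytes:
    """Return a gzipped tar archive holding a single file called `name`."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        # Build the header by hand so only the name and size are recorded;
        # nothing is copied from the local filesystem's metadata.
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


if __name__ == "__main__":
    with open("some_file.tgz", "wb") as f:
        f.write(pack_tgz("some_file.csv", b",foo,bar,baz\n0,1,2,3\n"))
```

The resulting some_file.tgz can then be uploaded with `mc cp` in place of the raw CSV; presumably the `source` in the viewer metadata has to point at the .tgz key as well.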

Looks like this is the cause: https://github.com/kubeflow/pipelines/blob/633e2ddcc8bba0aa07cf53ef438ecb58cd7d4fd7/frontend/server/server.ts#L160-L169
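To make the symptom concrete, here is a rough Python model of the behavior described in this thread. It is not a port of the linked server.ts lines; it only assumes the handler treats every object as a tar archive and writes out the contents of its entries, so a plain file produces no entries and an empty response. `read_artifact` is a hypothetical name:

```python
import io
import tarfile


def read_artifact(blob: bytes) -> bytes:
    """Concatenate the contents of every regular file in a tar(.gz) blob."""
    out = bytearray()
    try:
        with tarfile.open(fileobj=io.BytesIO(blob), mode="r:*") as tar:
            for member in tar:
                if member.isfile():
                    out += tar.extractfile(member).read()
    except tarfile.ReadError:
        # Not a tar archive: model the observed behavior of an empty,
        # error-free response rather than raising.
        return b""
    return bytes(out)
```

Under this model a raw CSV uploaded to the bucket comes back empty, while the same bytes wrapped in a .tgz come back intact, which matches the two curl results above.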

neuromage commented 5 years ago

@vicaire this doesn't seem related to viewer crd.

/assign @rileyjbauer Looks like a UI server issue?

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Bobgy commented 4 years ago

Workaround found in https://github.com/kubeflow/pipelines/issues/1078#issuecomment-479698096