dandi / dandi-cli

DANDI command line client to facilitate common operations
https://dandi.readthedocs.io/
Apache License 2.0

upload: establish testing #27

Closed yarikoptic closed 4 years ago

yarikoptic commented 4 years ago

#26 introduced the upload command. We should enable some kind of unit testing for it.

It seems that GitHub Actions supports Docker: https://help.github.com/en/articles/creating-a-docker-container-action and I even found an issue in the now-removed "docker" action repository: https://webcache.googleusercontent.com/search?q=cache:mPPb1xbgUgEJ:https://github.com/actions/docker/issues/11+&cd=2&hl=en&ct=clnk&gl=us which suggests that docker-compose is also available.

But we would first need to establish some user creds and obtain the key programmatically. @mgrauer - can you help with that?

mgrauer commented 4 years ago

We are working on a docker-compose setup for the dandiarchive, which might be a solution here. That might also end up being too heavyweight, in which case perhaps we should just mock a few endpoints instead. After we finish that PR I can think about this issue more.

yarikoptic commented 4 years ago

There is now a working docker-compose setup, so it is time to establish within the tests (on CI or not) a fixture which would use docker-compose to start an instance of the archive. Possibly of help, and code to adopt: within reproman we have "fixture generators" https://github.com/ReproNim/reproman/blob/master/reproman/tests/fixtures.py#L23 to create fixtures that need a specific docker or singularity container etc. They may come in useful here.
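
A minimal sketch of what such a fixture could look like, assuming the archive's docker-compose file is checked out locally (the compose file path, API endpoint, and timeout below are hypothetical):

```python
# Hypothetical session-scoped pytest fixture that brings the archive up with
# docker-compose for the duration of the test run and tears it down afterwards.
import subprocess
import time

import pytest
import requests

COMPOSE_FILE = "dandiarchive/docker/docker-compose.yml"  # assumed location
ARCHIVE_API = "http://localhost:8080/api/v1"             # assumed endpoint


@pytest.fixture(scope="session")
def local_dandi_archive():
    subprocess.run(["docker-compose", "-f", COMPOSE_FILE, "up", "-d"], check=True)
    try:
        # Poll the API (up to ~60s) until the archive answers, then hand the
        # base URL over to the tests.
        for _ in range(60):
            try:
                if requests.get(ARCHIVE_API).ok:
                    break
            except requests.ConnectionError:
                pass
            time.sleep(1)
        else:
            raise RuntimeError("archive instance did not come up in time")
        yield ARCHIVE_API
    finally:
        subprocess.run(
            ["docker-compose", "-f", COMPOSE_FILE, "down", "-v"], check=True
        )
```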

jwodder commented 4 years ago

@yarikoptic The images for the docker-compose setup are built by copying in the dandiarchive source, so either dandi-cli's GitHub actions need to check out a copy of dandiarchive (not the cleanest idea) or else the dandiarchive images need to be pushed to Docker Hub or a similar registry.

yarikoptic commented 4 years ago

Ok, we already have a dandiarchive organization on hub.docker.com linked to GitHub via the dandibot user... I have initiated a build for dandiarchive/dandiarchive-client - let's see if that one works out. I will update whenever I know more. @jwodder - do you have a hub.docker.com account? I could add you to the dandiarchive organization.

jwodder commented 4 years ago

@yarikoptic Yes; username: jwodder

yarikoptic commented 4 years ago

Ok, added you to the owners! (as well as to developers for that first repo, dandiarchive-client). I have prepared an automated build for dandiarchive-client and it seems to work fine! Now it is up to you @jwodder to link up the other components (I have set it to rebuild for master and for any tagged releases, which are yet to come in dandiarchive).

jwodder commented 4 years ago

@yarikoptic The "girder" image should be uploaded as well, as that's built by copying in the girder-dandi-archive directory from dandiarchive. That just leaves the "provision" image; it'd be easiest if that was uploaded to Docker Hub too.

yarikoptic commented 4 years ago

> @yarikoptic The "girder" image should be uploaded as well, as that's built by copying in the girder-dandi-archive directory from dandiarchive. That just leaves the "provision" image; it'd be easiest if that was uploaded to Docker Hub too.

Just to make sure re "uploaded": did you upload it yourself, or was it built on Docker Hub with a linkage to the repository, so that whenever we push new changes it gets rebuilt?

(Now I wonder -- would Docker Hub rebuild "latest" only on changes to the Dockerfile(s), or on any commit?)

jwodder commented 4 years ago

I did not upload anything to Docker Hub, as I wasn't sure as to the best way to go about it (though I believe I could set up automated builds of the other images).

I believe Docker Hub rebuilds on all commits, regardless of where they are in the repository. I've also seen Docker Hub keep a triggered build queued for over three hours, so, if possible, we might want to create an action to build images and push them to the Hub rather than relying on the Hub's infrastructure.
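
As a rough sketch, the build-and-push step such an action could run might look like this (the image names, build contexts, and credential environment variables are assumptions, not the actual dandiarchive layout):

```python
# Hypothetical build-and-push script for CI, as an alternative to waiting on
# Docker Hub's automated builds.  Image names and build contexts are assumed.
import os
import subprocess

IMAGES = {
    "dandiarchive/dandiarchive-client": "web",     # assumed build context
    "dandiarchive/dandiarchive-girder": "girder",  # assumed build context
}


def build_and_push(tag="latest"):
    # Credentials are expected to come from CI secrets exposed as env vars.
    subprocess.run(
        ["docker", "login", "-u", os.environ["DOCKERHUB_USER"], "--password-stdin"],
        input=os.environ["DOCKERHUB_TOKEN"].encode(),
        check=True,
    )
    for image, context in IMAGES.items():
        subprocess.run(["docker", "build", "-t", f"{image}:{tag}", context], check=True)
        subprocess.run(["docker", "push", f"{image}:{tag}"], check=True)


if __name__ == "__main__":
    build_and_push()
```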

yarikoptic commented 4 years ago

> I did not upload anything to Docker Hub, as I wasn't sure as to the best way to go about it (though I believe I could set up automated builds of the other images).

sorry I was not clear in my description of desired actions!

I did not see a girder component build set up on the hub, so I did it while recording a screencast. Unfortunately the middle portion, where I actually added a triggered build, was not recorded :-( FWIW here are the two parts:

> ... if possible, we might want to create an action to build images and push them to the Hub rather than relying on the Hub's infrastructure.

Yeap, in the long(er) run it might be desired. For now, and for tagged releases of dandiarchive, getting some images built automagically by Docker Hub, even with a delay, would suffice IMHO. But I would not stop anyone from preparing a proper action.

yarikoptic commented 4 years ago

Now that #159 is merged, and since we had some small fall-outs due to the absence of upload testing, I am giving this issue explicit urgency so that the upcoming release has upload tested to at least some minimal degree.

jwodder commented 4 years ago

@yarikoptic Should the tests just run upload and then check that the uploaded files exist on the Dandi instance, or should they be more in-depth somehow?

It looks like upload will have to be converted into a function that the command calls (in order to make it possible to specify the Dandi instance without depending on DANDI_DEVEL); should this function return anything (say, progress information for the caller to pass to pyout if it so chooses) or not?

yarikoptic commented 4 years ago

For now even the simple upload you outlined would suffice. Maybe do it a few times as a smoke test for the handling of the --existing option.

For now it doesn't need to return anything. We will need to decide later on what upload/download/... return (or even whether to make them generators), consistently across commands, based on the target needs/use cases. That should also later allow for better testing of different options (e.g. how to handle existing files).
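
A sketch of what such a test could look like once upload is callable as a function (the upload() import path and signature, the "local-docker" instance name, and the fixtures are assumptions):

```python
# Hypothetical shape of the upload smoke test; the upload() import path and
# signature, the instance name, and the fixtures are assumptions.
from dandi.upload import upload  # assumed import path after the refactor


def test_upload_smoke(local_dandi_archive, local_dandiset):
    # First upload should succeed and create the assets on the test instance.
    upload(paths=[str(local_dandiset)], dandi_instance="local-docker")

    # Repeat to smoke-test the --existing handling: "skip" should be a no-op,
    # "force" should re-upload the same files without erroring out.
    for existing in ("skip", "force"):
        upload(
            paths=[str(local_dandiset)],
            dandi_instance="local-docker",
            existing=existing,
        )
```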

jwodder commented 4 years ago

@yarikoptic The pyout.Tabular constructor is failing in the tests with curses.setupterm() complaining that it "could not find terminal". This doesn't happen if the -s (--capture=no) option is removed from the pytest invocation.

Once I removed the -s option, the upload test failed because the post-upload check couldn't find the uploaded file. Is the below code the correct way to check that a file sub-anm369962/sub-anm369962_ses-20170309.nwb that has been uploaded from a local Dandiset with id 000006 exists on the Dandi instance?

girder.lookup(
    client,
    collection_drafts,
    path=f"{dandi_id}/sub-anm369962/sub-anm369962_ses-20170309.nwb",
)
jwodder commented 4 years ago

I've fixed the above problem; I forgot to chdir. dandi upload should probably fail when given a path that doesn't exist.
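
A minimal sketch of such a fail-fast check (the helper name and where it would be called from are hypothetical):

```python
# Hypothetical early check in upload(): refuse to proceed if any of the given
# paths does not exist, instead of silently uploading nothing.
from pathlib import Path


def ensure_paths_exist(paths):
    missing = [p for p in paths if not Path(p).exists()]
    if missing:
        raise ValueError("Path(s) do not exist: " + ", ".join(map(str, missing)))
```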

yarikoptic commented 4 years ago

re pyout: I would file an issue on https://github.com/pyout/pyout/issues since AFAIK it should automagically determine that no terminal is available and switch to a "static renderer", like I believe it already does whenever there is no TTY.
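
In the meantime, a sketch of the kind of no-TTY guard the calling code could use (the wrapper below is hypothetical and not pyout's actual fallback mechanism; the pyout.Tabular keyword is assumed):

```python
# Hypothetical guard: only build pyout's live-updating table when attached to
# a real terminal; otherwise fall back to a trivial row printer.
import sys

import pyout


class PlainReporter:
    """Minimal stand-in for pyout.Tabular when there is no TTY."""

    def __call__(self, row):
        print(row)

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        return False


def make_reporter(columns):
    if sys.stdout.isatty():
        return pyout.Tabular(columns=columns)  # assumed constructor keyword
    return PlainReporter()
```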

as for testing whether it was uploaded: although that would bind upload with download, I would just download into a new temp folder and thus verify the full round-trip. The reason is that I do not want the tests to assume girder as the backend.

In #134 I am unifying the "list assets on the remote server" functionality a bit, so later we could maybe use that instead of a full download.


I typed the above and then the "fixed above" message appeared... will keep the comment anyway. And indeed, upload should fail one way or another if any file upload has failed (e.g. it pointed to a non-existing path, or just failed to upload). I think that is along the lines of figuring out the returned value(s) etc. ATM it doesn't do any of that, indeed, besides reporting in its output to the human (which was the primary target use case).

jwodder commented 4 years ago

> re pyout: I would file an issue on https://github.com/pyout/pyout/issues since AFAIK it should automagically determine that no terminal is available and switch to a "static renderer", like I believe it already does whenever there is no TTY.

I tried creating an MVCE for the problem, and the error disappeared; I can't figure out how to make it happen again without copying all of dandi-cli.

> as for testing whether it was uploaded: although that would bind upload with download, I would just download into a new temp folder and thus verify the full round-trip. The reason is that I do not want the tests to assume girder as the backend.

The download() function requires a URL containing the Girder ID of the uploaded file. There doesn't seem to be a way to avoid Girder at the moment.

yarikoptic commented 4 years ago

> I tried creating an MVCE for the problem, and the error disappeared; I can't figure out how to make it happen again without copying all of dandi-cli.

the "best" kind of problems... I would still report (even though without MVCE) it against pyout referencing relevant message here, @kyleam might have an immediate clue just based on the fact that we ran it via python -m pytest -s v

> The download() function requires a URL containing the Girder ID of the uploaded file. There doesn't seem to be a way to avoid Girder at the moment.

Right! I forgot that we do not interface instances there... Please proceed with the direct girder interface for now. I filed a separate issue, https://github.com/dandi/dandi-cli/issues/170, which should allow us to set up a "fuller" test deployment and would then allow testing downloads from it via the URL mapping done by the redirector ATM.