dandi / dandi-cli

DANDI command line client to facilitate common operations
https://dandi.readthedocs.io/
Apache License 2.0
22 stars 28 forks source link

Add "publish" (AKA API) service to testing docker compose setup #164

Closed yarikoptic closed 4 years ago

yarikoptic commented 4 years ago

There was an attempt to provide a unified docker compose setup but it was not finished: https://github.com/dandi/dandi-publish/pull/71 .

To test new download of #134 without contacting the mothership deployment, and any other future interaction with dandiarchive through API we will need to improve upon docker compose recipe of https://github.com/dandi/dandi-cli/pull/159 to include also the publish/API service.

jwodder commented 4 years ago

My attempts to get the Django container to run (using the latest code from dandi-publish@master) are currently failing with the following in the container's logs:

django-configurations version , using configuration DevelopmentConfiguration
[18:29:28] INFO     Watching for file changes with             autoreload.py:612
                    StatReloader                                                
Performing system checks...

Exception in thread django-main-thread:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.8/site-packages/django/utils/autoreload.py", line 53, in wrapper
    fn(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/django/core/management/commands/runserver.py", line 118, in inner_run
    self.check(display_num_errors=True)
  File "/usr/local/lib/python3.8/site-packages/django/core/management/base.py", line 392, in check
    all_issues = checks.run_checks(
  File "/usr/local/lib/python3.8/site-packages/django/core/checks/registry.py", line 70, in run_checks
    new_errors = check(app_configs=app_configs, databases=databases)
  File "/usr/local/lib/python3.8/site-packages/django/core/checks/urls.py", line 13, in check_url_config
    return check_resolver(resolver)
  File "/usr/local/lib/python3.8/site-packages/django/core/checks/urls.py", line 23, in check_resolver
    return check_method()
  File "/usr/local/lib/python3.8/site-packages/django/urls/resolvers.py", line 408, in check
    for pattern in self.url_patterns:
  File "/usr/local/lib/python3.8/site-packages/django/utils/functional.py", line 48, in __get__
    res = instance.__dict__[self.name] = self.func(instance)
  File "/usr/local/lib/python3.8/site-packages/django/urls/resolvers.py", line 589, in url_patterns
    patterns = getattr(self.urlconf_module, "urlpatterns", self.urlconf_module)
  File "/usr/local/lib/python3.8/site-packages/django/utils/functional.py", line 48, in __get__
    res = instance.__dict__[self.name] = self.func(instance)
  File "/usr/local/lib/python3.8/site-packages/django/urls/resolvers.py", line 582, in urlconf_module
    return import_module(self.urlconf_name)
  File "/usr/local/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/opt/django/dandi/urls.py", line 7, in <module>
    from rest_framework_extensions.routers import ExtendedSimpleRouter
  File "/usr/local/lib/python3.8/site-packages/rest_framework_extensions/routers.py", line 2, in <module>
    from rest_framework_extensions.utils import compose_parent_pk_kwarg_name
  File "/usr/local/lib/python3.8/site-packages/rest_framework_extensions/utils.py", line 6, in <module>
    from rest_framework_extensions.key_constructor.constructors import (
  File "/usr/local/lib/python3.8/site-packages/rest_framework_extensions/key_constructor/constructors.py", line 4, in <module>
    from rest_framework_extensions.key_constructor import bits
  File "/usr/local/lib/python3.8/site-packages/rest_framework_extensions/key_constructor/bits.py", line 3, in <module>
    from django.db.models.sql.datastructures import EmptyResultSet
ImportError: cannot import name 'EmptyResultSet' from 'django.db.models.sql.datastructures' (/usr/local/lib/python3.8/site-packages/django/db/models/sql/datastructures.py)

This is apparently a Django 3.1 compatibility problem which the responsible library (drf-extensions) has been updated for but not yet made a release with yet.

jwodder commented 4 years ago

So after editing dandi-publish's setup.py to require django<3.1, Django comes up without any fatal errors, but http://localhost:8000 is a 404 with the following text:

Using the URLconf defined in dandi.urls, Django tried these URL patterns, in this order:

  1. __debug__/
  2. api/
  3. admin/
  4. swagger/ [name='schema-swagger-ui']
  5. redoc/ [name='schema-redoc']

The empty path didn't match any of these.

yarikoptic commented 4 years ago

May be @dandi/dandiarchive would have an immediate clue/recommendation.

jwodder commented 4 years ago

@yarikoptic Did that ping anyone to come here, or are you suggesting I ask in the dandiarchive team discussion (or somewhere else)?

yarikoptic commented 4 years ago

just "to come here" ... indeed it was vague, sorry.

yarikoptic commented 4 years ago

but indeed could be also asked on slack in "internal"

mgrauer commented 4 years ago

I'm lurking here on dandi-cli for now, I'm not sure if I got pinged by the @dandi/dandiarchive or not. The right thing to do to get help for this problem was to post an issue in dandi-publish, as @jwodder did.

yarikoptic commented 4 years ago

Apparently you just weren't a part of the team @mgrauer ;-) you are now! See others and possibly remove former members on https://github.com/orgs/dandi/teams/dandiarchive/members

Issuing issues for simple questions might be too much, so we better make it work "properly"

jwodder commented 4 years ago

Status: I'm able to get the docker-publish Django container to come up, and I can connect it to Girder using a "publish" admin user (At least, I think it's connected; I don't know how to test that.) I can't get Girder to connect to Django, as that requires getting an API key from Django, which is currently a GUI-only operation. Also, the celery container is exiting, but I don't know how important that is.

Using this setup requires building two more custom images; one (which requires a rebuild everytime docker-publish changes) uses a modified version of the Dockerfile currently in docker-publish@master, and the other uses a Dockerfile that only exists in the PR you linked me to. Where should I store these Dockerfiles so that Docker Hub can build them?

Also, should this setup be used for the local_docker_compose fixture or a new fixture?

yarikoptic commented 4 years ago

Also, should this setup be used for the local_docker_compose fixture or a new fixture?

I think it should all go into the same fixture, which purpose is "to bring up all necessary components dandi-cli would interact with as with a real main deployment". IMHO that would be simpler and more "realistic" although not modular and thus cause longer tests session fixture setup time even if not all components get used by the tests.

@mgrauer , does dandi-publish require anything "custom" (not present in the current/default girder deployment) functionality to be deployed alongside whenever dandi-publish gets deployed and https://github.com/dandi/dandiarchive/pull/341 (web ui, we don't test/interact with it here, not bringing up in compose recipe for testing)? If so - that might preclude proceeding with adding "publish" to that fixture, but if not -- should be ok I think.

mgrauer commented 4 years ago

does dandi-publish require anything "custom" (not present in the current/default girder deployment) functionality to be deployed alongside whenever dandi-publish gets deployed and dandi/dandiarchive#341 (web ui, we don't test/interact with it here, not bringing up in compose recipe for testing)? If so - that might preclude proceeding with adding "publish" to that fixture, but if not -- should be ok I think.

Sorry, I'm not sure what you are asking here.

It seems like you are going through a lot of work--is all of this work necessary? I'd love to save you from so much work if possible, but I'm not following what you are doing.

Thinking out loud: Why do you specifically need both of the Docker Compose services up for your testing? Which endpoints are you hitting currently from the dandi-publish app? Perhaps you could just mock those few readonly endpoints out rather than having to run a full dandi-publish app and connect it to Girder and run a publish task? Also with the consideration that the django service is going to be changing quite a bit and we'd like to migrate into a unified system, whatever work you are putting into this now will also have to change to keep up--it seems like a large maintenance burden to me and I'd like you to be aware of that. Is there a lower cost path you could pursue for now? I don't understand enough of what your goals are to evaluate the tradeoffs.

yarikoptic commented 4 years ago

In brief (can reply in detail or we could have a jitsi chat), overall goal: to be able to test dandi-cli functionality. dandi client is useless without archive. For upload we need to test against a live instance. We don't want to butcher/conflict/etc our main instance. Mocking can get us half way, but will not guarantee that upload would work in real life. The same goes for locking. For download we could test on real instance (like we already do) but makes testing in simulated tricky scenarios "tricky" and causes unnecessary requirement for network connectivity. As more API developed, we need such integration testing even more. Especially in the current turbulent development where things change and break.

As things progress, and there would be more instances of dandi archive, client would need to be tested against multiple deployment versions/scenarios.

My question was: can we use dandi-publish component with plain as deployed instance of girder, or it requires some special instance of girder?

mgrauer commented 4 years ago

Thanks for clarifying your testing needs.

My question was: can we use dandi-publish component with plain as deployed instance of girder, or it requires some special instance of girder?

If you have an instance of dandi-publish it should in theory work with any instance of dandiarchive (girder + dandiarchive plugin), I don't think anything special is needed. You will need to have bidirectional authentication set up.

dandiarchive exposes a publish endpoint that requires a user named "publish" with admin permissions to use. it also relies on two settings, PUBLISH_API_URL ("dandi.publish_api_url" is the DB setting key) and PUBLISH_API_KEY("dandi.publish_api_key" is the DB setting key) to talk to the publish service. You can set these system settings through an endpoint like this. You need to get the token manually from publish. Here's info on how to connect from publish into dandiarchive.

The dandiarchive publish endpoint first calls sync on the publish service (which calls back into Girder with a synchronous operation). dandiarchive publish endpoint then calls /versions/publish on dandi-publish, which launches an async operation in Celery to copy all of the metadata and data from Girder into the publish service.

yarikoptic commented 4 years ago

note: call out to publish API endpoint might (or not) happen while preparing tests for https://github.com/dandi/dandi-cli/pull/189 where the only DANDI API for download available is only for published datasets, so it would need to be published first within tests fixture.

yarikoptic commented 4 years ago

@mgrauer could you or your team provide a complete recipe to be able to establish or get that key within docker compose setup? In the long run: there should be a single key and some "permissions" levels I guess (to allow or disallow publishing by users) https://github.com/dandi/dandi-publish/issues/135

yarikoptic commented 4 years ago

ok, token could be easily obtained by running (XXXX replaces sensitive output)

$> docker exec 4efc ./manage.py drf_create_token root
Generated token XXXX for user root

where 4efc is the container of dandi-publish_django . And verified that it gives access:

$> curl -X POST "http://localhost:8000/api/dandisets/000001/versions/publish/" -H "accept: application/json" -H "Content-Type: application/json" -H "Authorization: Token BOGUS" -d "{ \"dandiset\": {\"identifier\": \"000001\"}, \"version\": \"1\", \"name\": \"name string\", \"description\": \"string\"}"
{"detail":"Invalid token."}%                                                                                                                                                                                                       
$> curl -X POST "http://localhost:8000/api/dandisets/000001/versions/publish/" -H "accept: application/json" -H "Content-Type: application/json" -H "Authorization: Token XXXX" -d "{ \"dandiset\": {\"identifier\": \"000001\"}, \"version\": \"1\", \"name\": \"name string\", \"description\": \"string\"}"
{"detail":"Not found."}%     
jwodder commented 4 years ago

The image for the minio container (which the Django container depends on) is currently failing to build. I posted a question about it in dandi/dandi-publish#71.

jwodder commented 4 years ago

I'm trying to get dandiarchive to connect to dandi-publish following these instructions, but the following command:

curl -fsSL -H "Girder-Token: $PUBLISH_API_KEY" -X PUT \
    -F key=dandi.publish_api_url \
    -F value=http://localhost:8000/api/ \
    http://localhost:8080/api/v1/system/setting

is currently failing with a 400 error and the message:

{"field": "key", "message": "Invalid setting key \"dandi.publish_api_url\".", "type": "validation"}

The same error is returned when attempting to set the setting via the Swagger UI.

I've created an issue on dandiarchive: https://github.com/dandi/dandiarchive/issues/494

jwodder commented 4 years ago

@yarikoptic I think I've finally got the prototype working now; is there a way to check whether the dandiarchive+dandi-publish connection is working as intended?

yarikoptic commented 4 years ago

well, "dandiarchive" == "webui" + "girder" + ... . In dandi-cli I guess we do not care about webui -> dandi-publish connection, but if you were to test it -- I guess what they have https://github.com/dandi/dandiarchive/tree/master/test might do it. We should not worry about that portion too much ATM and leave it to "dandiarchive" people. We do care about dandi-publish -> girder so I guess the best would be if provisioning script (if it doesn't already) would just issue a /dandisets/{dandiset__pk}/versions/publish/ request on some dandiset with a known version, and a test would obtain assets listing for that published version of that dandist. Or did I misunderstand "connection" you are after? ;)

jwodder commented 4 years ago

I just need this PR merged, and then I'll set up building a "publish" image on Docker Hub, and then this issue should finally be done.