galaxyproject / galaxy

Data intensive science for everyone.
https://galaxyproject.org

Error in data libraries "import from user directory" function when using external user auth #2812

Open sarah-peter opened 7 years ago

sarah-peter commented 7 years ago

Dear all,

I have my Galaxy instance configured with external user auth through Apache. In order to get the API to work from outside (e.g. bioblend), Apache is configured to not authenticate requests to /api. Since I updated to 16.04 with the new data libraries, I get the following error when trying to import datasets "from User Directory", but everything else is working fine:

10.184.132.10 - - [30/Jul/2016:18:09:27 +0200] "POST /api/libraries/datasets?encoded_folder_id=F7b46bd6d01de922f&source=userdir_file&path=160308_WTCHG_254732_201.bam&file_type=auto&dbkey=? HTTP/1.1" 500 - "https://galaxy-server.uni.lu/library/list" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0"
Error - <type 'exceptions.AssertionError'>: use_remote_user is set but HTTP_REMOTE_USER header was not provided
URL: https://galaxy-server.uni.lu/api/libraries/datasets?encoded_folder_id=F7b46bd6d01de922f&source=userdir_file&path=160308_WTCHG_254732_201.bam&file_type=auto&dbkey=?
File '/home/galaxy/galaxy-dist/lib/galaxy/web/framework/middleware/error.py', line 151 in __call__
  app_iter = self.application(environ, sr_checker)
File '/mnt/gaiagpfs/projects/galaxy/internal/.venv/local/lib/python2.7/site-packages/paste/recursive.py', line 85 in __call__
  return self.application(environ, start_response)
File '/home/galaxy/galaxy-dist/lib/galaxy/web/framework/middleware/remoteuser.py', line 76 in __call__
  return self.app( environ, start_response )
File '/mnt/gaiagpfs/projects/galaxy/internal/.venv/local/lib/python2.7/site-packages/paste/httpexceptions.py', line 640 in __call__
  return self.application(environ, start_response)
File '/home/galaxy/galaxy-dist/lib/galaxy/web/framework/base.py', line 131 in __call__
  return self.handle_request( environ, start_response )
File '/home/galaxy/galaxy-dist/lib/galaxy/web/framework/base.py', line 158 in handle_request
  trans = self.transaction_factory( environ )
File '/home/galaxy/galaxy-dist/lib/galaxy/web/framework/webapp.py', line 68 in <lambda>
  self.set_transaction_factory( lambda e: self.transaction_chooser( e, galaxy_app, session_cookie ) )
File '/home/galaxy/galaxy-dist/lib/galaxy/web/framework/webapp.py', line 99 in transaction_chooser
  return GalaxyWebTransaction( environ, galaxy_app, self, session_cookie )
File '/home/galaxy/galaxy-dist/lib/galaxy/web/framework/webapp.py', line 198 in __init__
  self.error_message = self._authenticate_api( session_cookie )
File '/home/galaxy/galaxy-dist/lib/galaxy/web/framework/webapp.py', line 380 in _authenticate_api
  self._ensure_valid_session( session_cookie )
File '/home/galaxy/galaxy-dist/lib/galaxy/web/framework/webapp.py', line 432 in _ensure_valid_session
  "use_remote_user is set but %s header was not provided" % self.app.config.remote_user_header
AssertionError: use_remote_user is set but HTTP_REMOTE_USER header was not provided

martenson commented 6 years ago

@erasche have you ever seen this? (Yes, you are my external_auth person.)

hexylena commented 6 years ago

I have not in recent memory, but 16.04 was a long time ago. It's possible that something is wrong there specifically.

sarah-peter commented 6 years ago

I just want to clarify that external user auth in this case is not the Galaxy built-in LDAP functionality, but LDAP auth through Apache (when I set up our Galaxy server, the built-in LDAP was not available yet).

Also, the "import from user directory" functionality of the data libraries needs to be specifically enabled in the configuration, and I guess it's not used often, especially not together with authentication via Apache.

The problem only occurs if I tell Apache to not authenticate requests to the /api paths (but if I don't do that, the API doesn't work). That's the reason the HTTP_REMOTE_USER header is not set by Apache in this case, while the data libraries seem to expect it. For all other parts of Galaxy it's fine, probably because they don't require the HTTP_REMOTE_USER header for API calls.
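To make the failure mode described above concrete, here is a minimal sketch in plain WSGI. It is a simplified, hypothetical model, not Galaxy's actual code: the names `RemoteUserSketch`, `session_checking_app`, and `call` are made up for illustration, and the assumption that a request carrying a `key` parameter skips the session check is mine. It shows a middleware that lets `/api/` paths through without the header, followed by a downstream check that still insists on the header when no API key is supplied.

```python
from urllib.parse import parse_qs


class RemoteUserSketch:
    """Hypothetical middleware: require the remote-user header, except for /api/ paths."""

    def __init__(self, app, remote_user_header="HTTP_REMOTE_USER"):
        self.app = app
        self.remote_user_header = remote_user_header

    def __call__(self, environ, start_response):
        path_info = environ.get("PATH_INFO", "")
        # API requests are passed through; they are expected to authenticate by key.
        if path_info.startswith("/api/"):
            return self.app(environ, start_response)
        if not environ.get(self.remote_user_header):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"remote user header missing"]
        return self.app(environ, start_response)


def session_checking_app(environ, start_response):
    """Hypothetical downstream app: without an API key it falls back to a
    session check that requires the remote-user header (assumed behaviour)."""
    has_api_key = "key" in parse_qs(environ.get("QUERY_STRING", ""))
    if not has_api_key and not environ.get("HTTP_REMOTE_USER"):
        # Same message as in the traceback above.
        raise AssertionError(
            "use_remote_user is set but HTTP_REMOTE_USER header was not provided"
        )
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


def call(app, path, query="", headers=None):
    """Tiny helper to invoke a WSGI app without a server."""
    environ = {"PATH_INFO": path, "QUERY_STRING": query}
    environ.update(headers or {})
    status = []
    body = app(environ, lambda s, h: status.append(s))
    return status[0], b"".join(body)


app = RemoteUserSketch(session_checking_app)
```

In this model, a browser request to `/api/libraries/datasets` that carries neither a key nor the header passes the middleware and then fails in the downstream check, which matches the shape of the traceback.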

hexylena commented 6 years ago

Yep, all clear. I used that same method for many years, @sarah-peter (before we switched to CAS auth, but that is also via REMOTE_USER).

I don't see anything obvious in the release notes, nor in a diff of remoteuser.py between 16.01 and 16.07.

In 17.01 we added logging that explains if the header is set incorrectly (e.g. Apache's fun (null)@domain), but if you had it working before, there's no obvious reason it should have quit working. There shouldn't be anything special about data libs vs. everything else, but I'm not very familiar with that API, and these releases were a year ago, so I've probably forgotten anything special around them. https://github.com/galaxyproject/galaxy/pull/1801 might be relevant. Do you have those changes in your instance?
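As an illustration of the kind of check such logging could be built on, here is a small sketch. It is not the code from 17.01; the function names and the exact rules are assumptions made for this example. It flags a missing header and the "(null)" prefix mentioned above, and logs why a value was rejected.

```python
import logging

log = logging.getLogger(__name__)


def describe_remote_user_problem(value):
    """Return a readable problem for a remote-user value, or None if it looks usable."""
    if value is None or value.strip() == "":
        return "header missing or empty"
    if value.startswith("(null)"):
        return "header contains '(null)'; the proxy probably did not authenticate the request"
    return None


def usable_remote_user(environ, header="HTTP_REMOTE_USER"):
    """Return the header value if it looks usable, otherwise log the reason and return None."""
    value = environ.get(header)
    problem = describe_remote_user_problem(value)
    if problem:
        log.warning("Rejecting %s=%r: %s", header, value, problem)
        return None
    return value
```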

I'm sorry it isn't working!

sarah-peter commented 6 years ago

According to my notes I updated our Galaxy on 2016-07-19 to release_16.04. My file looks a bit different, but I have

73         # The API handles its own authentication via keys
74         # Check for API key before checking for header
75         if path_info.startswith( '/api/' ):
76             return self.app( environ, start_response )

Edit: Before 16.04 it was on 15.10.

hexylena commented 6 years ago

@dannon cc'ing you here because I've never understood the seemingly magic authentication that happens when remote_users access the API, sans-api key. Do you remember anything around this time?

@sarah-peter is it possible that you could upgrade to a more recent release? 16.04 is no longer covered for security patches. 16.10 or 17.01 would be even better.

sarah-peter commented 6 years ago

At the moment I'm not planning to update our Galaxy. It's barely used and often there are issues after updates, so I don't see why I should invest that time.

If you can't locate the source of the problem anymore, feel free to close this issue and I will open a new one if I encounter the issue again in an up-to-date version.

dannon commented 6 years ago

@erasche I don't remember any specific, relevant remote user changes from around this time.

Since it happened with the introduction of the new libraries (and only there, in this particular case?), I'd look there and see how that POST in the log was being made. Then I'd compare that with how the toolform posts are made.
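One low-effort way to start that comparison is to diff the query parameters of two logged request URLs. The sketch below uses only the standard library. The first URL is the one from the log above; the second is a made-up placeholder (its path and the `key` parameter are assumptions for illustration, not a real toolform request).

```python
from urllib.parse import parse_qs, urlsplit


def query_params(url):
    """Parse the query string of a URL into a dict of parameter lists."""
    return parse_qs(urlsplit(url).query, keep_blank_values=True)


def compare_requests(url_a, url_b):
    """List the parameter names that appear in only one of the two URLs."""
    a, b = query_params(url_a), query_params(url_b)
    return {
        "only_in_a": sorted(set(a) - set(b)),
        "only_in_b": sorted(set(b) - set(a)),
    }


# URL taken from the access log in the original report.
library_url = (
    "https://galaxy-server.uni.lu/api/libraries/datasets"
    "?encoded_folder_id=F7b46bd6d01de922f&source=userdir_file"
    "&path=160308_WTCHG_254732_201.bam&file_type=auto&dbkey=?"
)
# Hypothetical second request, for illustration only.
other_url = "https://galaxy-server.uni.lu/api/example?key=HYPOTHETICAL_KEY"

diff = compare_requests(library_url, other_url)
```

A parameter such as an API key showing up on one side only would be a first hint at why one request authenticates and the other does not. Headers and cookies would need the same treatment, which the access log alone does not show.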

First in 17.01, https://github.com/galaxyproject/galaxy/pull/3976 may be a fix for this, since I reworked the session handling logic a little there, and it'll definitely log more information, as you mentioned before. It also pops garbage headers out, which could otherwise break some requests without an obvious cause.