IQSS / dataverse

Open source research data repository software
http://dataverse.org

Optimize Permissions Lookup #2122

Closed astrofrog closed 5 years ago

astrofrog commented 9 years ago

I tried accessing:

https://dataverse.harvard.edu/api/dataverses/:root/contents?key=...

but this hangs. If I replace :root by say cfa, then it works fine.

pdurbin commented 9 years ago

"I wonder if querying the root is an expensive operation," I opined at http://irclog.iq.harvard.edu/dataverse/2015-04-30#i_19145

This is potentially yet another expensive tree traversal problem. See also #752.

pdurbin commented 9 years ago

This performance problem with the "contents" API endpoint was also mentioned at https://github.com/IQSS/dataverse/issues/1837#issuecomment-102085381

garthg commented 9 years ago

Hi,

@pdurbin Thanks for connecting this issue to the other trackers. This bug is a blocker for my project, as it impacts API-driven dataset editing and dataset creation. The non-returning call to get_contents() is very hard to avoid due to the combination of: 1) requiring database IDs instead of working only with DOIs (tracked in issue #1837), AND 2) the inability to search for non-published datasets (tracked in issue #1299).

Garth

pdurbin commented 8 years ago

@michbarsinai and I took a look at this today and I captured some profiling data, which I uploaded to Google Drive. curl http://localhost:8080/api/dataverses/:root/contents?key=$API_TOKEN ran for about 40 minutes but I'm not sure what the output ultimately was since I was piping the output to jq and it wasn't valid JSON.

It looks like time is spent especially in these methods:

[Screenshot, 2015-12-17: profiler output listing the methods where time is spent]

pdurbin commented 8 years ago

@michbarsinai you poked at the code a bit the other day, I believe. Perhaps you could comment on any findings.

michbarsinai commented 8 years ago

Seems to have something to do with group resolution. This one is the next on my list, so hopefully we'll see some advances here soon.

pdurbin commented 8 years ago

@scolapasta has been working on groups stuff in #2978 but I'm not sure whether it's related to group resolution or not.

kcondon commented 8 years ago

The root cause of this may be #752 (tree traversal); however, there is a related issue, #1160. The effort will be to investigate the root cause.

mheppler commented 7 years ago

@yarikoptic commented on Apr 4 on issue #3058

$> wget  'https://dataverse.harvard.edu/api/dataverses/icpsr/contents?key=SENSORED' 
--2016-04-04 09:32:48--  https://dataverse.harvard.edu/api/dataverses/icpsr/contents?key=c87766ca-e34c-4e9c-a23e-be277265074c
Resolving dataverse.harvard.edu (dataverse.harvard.edu)... 128.103.69.227
Connecting to dataverse.harvard.edu (dataverse.harvard.edu)|128.103.69.227|:443... connected.
HTTP request sent, awaiting response... 500 Internal Server Error
2016-04-04 09:33:49 ERROR 500: Internal Server Error.

pdurbin commented 7 years ago

Is anyone who is following this performance issue still suffering from it?

yarikoptic commented 7 years ago

did API change? I got the API token from https://dataverse.harvard.edu/dataverseuser.xhtml?selectTab=apiTokenTab and tried to do as I did previously:

$> wget "https://dataverse.harvard.edu/api/dataverses/icpsr/contents?key=$DATAVERSE_API"
--2017-06-23 10:14:17--  https://dataverse.harvard.edu/api/dataverses/icpsr/contents?key=7e8.......................
Resolving dataverse.harvard.edu (dataverse.harvard.edu)... 128.103.69.227
Connecting to dataverse.harvard.edu (dataverse.harvard.edu)|128.103.69.227|:443... connected.
HTTP request sent, awaiting response... 401 Unauthorized

Username/Password Authentication Failed.

garthg commented 7 years ago

Hi,

I still had this problem as of last week. Should I re-test now?

Thanks,

Garth

pdurbin commented 7 years ago

@yarikoptic @garthg no the API didn't change. Sorry to get your hopes up! I was just sweeping through old issues the other day trying to gauge interest. Sounds like you're both still interested in a fix. Thanks!

SamiSousa commented 6 years ago

I'm still interested in this issue. Ultimately displaying the contents of the root dataverse can be avoided by using the search API to find subtree dataverses, and using the native API from there (assuming there aren't datasets listed directly under the root dataverse). However, this doesn't solve the issue for other large dataverses beneath root...
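
A rough sketch of the workaround described above: page through the search API to collect the dataverses under a subtree, then call the native API on each one. The `search` callable and the response field names (`data.items`, `identifier`, `total_count`) are assumptions about the Search API's JSON shape, not something confirmed in this thread, so check them against your installation.

```python
from typing import Callable, Dict, Iterator

# Hypothetical type: performs GET /api/search with the given query
# parameters and returns the decoded JSON body.
SearchFn = Callable[[Dict[str, object]], dict]


def iter_dataverse_aliases(search: SearchFn, subtree: str,
                           per_page: int = 50) -> Iterator[str]:
    """Yield aliases of dataverses under `subtree`, one search page at a time."""
    start = 0
    while True:
        body = search({"q": "*", "type": "dataverse", "subtree": subtree,
                       "start": start, "per_page": per_page})
        items = body["data"]["items"]          # assumed response layout
        if not items:
            return
        for item in items:
            yield item["identifier"]           # assumed to hold the alias
        start += len(items)
        if start >= body["data"]["total_count"]:
            return
```

With the `requests` library, `search` could be something like `lambda p: requests.get(f"{dataverse_server}/api/search", params=p).json()`; each yielded alias can then be passed to `/api/dataverses/{alias}/contents`.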

djbrooke commented 6 years ago

Moving to backlog column to get an estimate on this during our next backlog grooming discussion. The performance issues here are affecting some integrations.

pdurbin commented 6 years ago

The performance issues here are affecting some integrations.

I just tested OSF and while I can see datasets in my "Open Source at Harvard" dataverse at https://dataverse.harvard.edu/dataverse/open-source-at-harvard just fine ....

[Screenshot, 2018-05-25: datasets listed for the "Open Source at Harvard" dataverse in OSF]

... I can't see any datasets in the "root" dataverse, which is called "Harvard Dataverse" and see an error saying "Could not load datasets":

[Screenshot, 2018-05-25: "Could not load datasets" error for the root dataverse in OSF]

Mind you, I don't actually have any datasets in the root dataverse of Harvard Dataverse, so I'm not affected. @shlake are any of your users affected by this bug?

Yesterday @djbrooke forwarded to me a conversation with @sloria (hi!) and I'm trying to get a sense of the extent of the problem.

shlake commented 6 years ago

This used to work. Yes, in the list you would only see datasets in the main Harvard dataverse (top level) that you have curator (and another, I think) permissions to. It very well could be a timing issue, in that OSF can't sort through & test permissions with thousands of datasets. UVA has fewer than 100 so timing hasn't affected us.

This whole display needs MAJOR work & OSF didn't have the capacity to look at it, & I didn't have time to pursue and the project manager at OSF had a baby last Fall. So nothing has been upgraded in a year or two.

I can look at my notes next week when I get back in the office.

shlake commented 6 years ago

@pdurbin here's some info in my Dataverse Presentation last year https://osf.io/7w8rz/

sloria commented 6 years ago

There's more detail in my email thread with @djbrooke , but this is the method that is timing out: https://github.com/IQSS/dataverse-client-python/blob/66a01e216812b338c9502911a2d022223b9094a7/dataverse/dataverse.py#L137

In addition, this request to get a Dataverse's contents hangs then eventually returns a 500 response (which may be a separate issue from the slowness?): https://gist.github.com/sloria/409d6b5d547aa65d9d01dcc67aceec36

pdurbin commented 6 years ago

@oscardssmith and I discussed this issue this morning and we agree that it will be important for a developer to be able to load up a heavily nested hierarchy of dataverses in order to see the slowness in a dev environment. Raman wrote a Python script to do this, which I mentioned over at https://github.com/IQSS/dataverse/issues/752#issuecomment-64442551 . I'm attaching it and the input file here (with ".txt" appended so GitHub Issues will take it) but please note that it uses Selenium and we might want to simplify it to just hit the API directly.

This is how the tree of dataverses looks:

[Image: tree of dataverses]

Based on old comments in this issue it looks like I suspected slowness in the permission system and I don't believe this has been addressed yet. It would be a good place to start digging, I think. There's also the concept of Modified Preorder Tree Traversal (MPTT) that may or may not be worth investigating. I wrote about it at https://github.com/IQSS/dataverse/issues/752#issuecomment-64132165

oscardssmith commented 6 years ago

My initial reaction is that MPTT is a really bad fit for our structure. It has the significant weakness that creating or deleting new objects in the tree requires remaking a significant portion of the structure. Instead, I think a better approach would be to use multiple rows per dataverse to list all parents. This has the advantage of never introducing large changes, while still removing the need for recursion.
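
A minimal sketch of the "multiple rows per dataverse to list all parents" idea, using an in-memory SQLite database as a stand-in. The table and column names (`dataverse`, `dataverse_ancestor`) and the `cfa-sub` alias are made up for illustration and are not the actual Dataverse schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dataverse (id INTEGER PRIMARY KEY, alias TEXT, parent_id INTEGER);
    -- One row per (ancestor, descendant) pair; hypothetical table name.
    CREATE TABLE dataverse_ancestor (ancestor_id INTEGER, descendant_id INTEGER, depth INTEGER);
""")


def add_dataverse(dv_id, alias, parent_id=None):
    """Insert a dataverse and copy its parent's ancestor rows, one level deeper."""
    conn.execute("INSERT INTO dataverse VALUES (?, ?, ?)", (dv_id, alias, parent_id))
    if parent_id is not None:
        conn.execute(
            "INSERT INTO dataverse_ancestor "
            "SELECT ancestor_id, ?, depth + 1 FROM dataverse_ancestor "
            "WHERE descendant_id = ? "
            "UNION ALL SELECT ?, ?, 1",
            (dv_id, parent_id, parent_id, dv_id),
        )


def ancestors(dv_id):
    """Return ancestor ids, nearest first, with a single non-recursive query."""
    rows = conn.execute(
        "SELECT ancestor_id FROM dataverse_ancestor "
        "WHERE descendant_id = ? ORDER BY depth",
        (dv_id,),
    )
    return [r[0] for r in rows]


add_dataverse(1, "root")
add_dataverse(2, "cfa", parent_id=1)
add_dataverse(3, "cfa-sub", parent_id=2)
```

Creating a dataverse only adds rows for the new node, so inserts stay local; moving a subtree would still require rewriting the rows for everything in that subtree.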

michbarsinai commented 6 years ago

That's exactly why we abandoned it last time. I agree that we need to start seeking non-normalized solutions. One way could be what @oscardssmith offers here. We could also use solr (since it's part of the infrastructure already). The extreme solutions could be using graph databases (e.g. Neo4J) or using recursive SQL (e.g. https://www.postgresql.org/docs/8.4/static/queries-with.html).
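
For the recursive SQL option, a small sketch of what a `WITH RECURSIVE` query over a containment hierarchy might look like. SQLite stands in for PostgreSQL here (both accept this syntax), and the `dvobject` table with `id` and `owner_id` columns is an assumed, simplified layout rather than the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dvobject (id INTEGER PRIMARY KEY, owner_id INTEGER)")
# 1 is the root; 2 and 4 sit under it; 3 sits under 2.
conn.executemany("INSERT INTO dvobject VALUES (?, ?)",
                 [(1, None), (2, 1), (3, 2), (4, 1)])

ANCESTORS_SQL = """
WITH RECURSIVE chain(id, owner_id, depth) AS (
    SELECT id, owner_id, 0 FROM dvobject WHERE id = ?
    UNION ALL
    SELECT d.id, d.owner_id, chain.depth + 1
    FROM dvobject d JOIN chain ON d.id = chain.owner_id
)
SELECT id FROM chain ORDER BY depth
"""


def containment_chain(dv_id):
    """Return the object id followed by its owners up to the root, in one query."""
    return [row[0] for row in conn.execute(ANCESTORS_SQL, (dv_id,))]
```

The climb happens inside the database in a single round trip, instead of one query per level issued from application code.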

oscardssmith commented 6 years ago

grr. The script isn't running. I can't get its imports working.

scolapasta commented 6 years ago

The issue doesn't relate to the nested hierarchy as this endpoint doesn't delve into the tree structure (please correct me if that's wrong). The issue, as far as we understood it the last time, is with doing individual permission checks on the many dv objects at one level.

As @michbarsinai suggests, one way is to have this go through solr, as we preindex permissions there (in fact, the current workaround for this is to use the search endpoints instead of this one). The biggest concern with this is if we wanted to have a way of querying without using solr. At this point, I'd say it'd be reasonable to solve it this way so that we have a working solution.

Brainstorming some other approaches:

oscardssmith commented 6 years ago

@scolapasta If I am reading the code correctly, this at one point calls permissionsForSingleRoleAssignee which does a recursive tree traversal.

scolapasta commented 6 years ago

@oscardssmith I think that's checking the user/group hierarchy, not the dvObject tree.. (again, I may be misremembering - feel free to stop by and we can look at the code)

shlake commented 6 years ago

Question: How is "My Data" displayed? That seems to come up fast. Or is it because that's coded in JAVA and not via API (SWORD, etc)? With the OSF example, I would only expect to see the datasets that I have permissions for, which show up in "My Data" right?

scolapasta commented 6 years ago

My Data uses Solr (similar to the dataverse page).

michbarsinai commented 6 years ago

Another thing we should do anyway is test for required permissions for a given command, rather than getting all the permissions for a user on a given DvObject, as we do now. That is:

Today:

1. Get all the permissions the user has on a DvObject (recursive climb in DvObject containment hierarchy x (groups + user) )
2. Get the permissions required for the command (method call)
3. Check that (1) contains (2)

Optimized:

1. Get the permissions required for the command (method call)
2. Creep up the DvObject containment hierarchy until all permissions in (1) are found, or until we hit a permission root.

Today's code was easy to write, and it's easy to see that it's correct. But it collects more permissions than necessary. Since for most cases only a single permission is required, we might be able to cut down a lot of SQL and JPA code. We could also devise more efficient JPA queries to get all the permissions for the user and the relevant groups in a single query.
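
The "Optimized" steps above could be sketched roughly as follows. This is a toy in-memory model, not the Dataverse code: the `DvObject` class, its `grants` mapping, and the permission and assignee names are all hypothetical, chosen only to show the early exit.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Set


@dataclass
class DvObject:
    name: str
    owner: Optional["DvObject"] = None
    permission_root: bool = False
    # assignee identifier -> permissions granted directly on this object
    grants: Dict[str, Set[str]] = field(default_factory=dict)


def has_permissions(obj: DvObject, assignees: Iterable[str],
                    required: Set[str]) -> bool:
    """Climb from obj toward the root, stopping once `required` is covered."""
    missing = set(required)
    assignees = list(assignees)      # the user plus the groups they belong to
    current = obj
    while current is not None and missing:
        for assignee in assignees:
            missing -= current.grants.get(assignee, set())
        if current.permission_root:  # do not inherit from above a permission root
            break
        current = current.owner
    return not missing
```

When a single permission is required and it is granted close to the object, the loop ends after one or two levels instead of collecting every permission all the way up.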

oscardssmith commented 6 years ago

Question: what is the minimal set of things needed to reproduce this issue? I've created 5000 dataverses, and the command is quite quick.

pdurbin commented 6 years ago

@oscardssmith I believe that the only place where we've heard of this performance problem is https://dataverse.harvard.edu . Anyone can create an account there so you would be welcome to create one and make sure that the problem can still be reproduced. Can you please attach your script to create all those dataverses? I wonder if we need to also create datasets (and maybe files) and also make a large number of role assignments. If necessary, we can get you a dump of the production database but first I'd double check that you're still seeing the performance problem in production, as I suspect you will.

oscardssmith commented 6 years ago

import json
import requests

def make_nested(i=1):
    if i > MAX:
        return
    print(i)
    dataverse_json = '''{
    "name": "Scientific Research",
    "alias": "nested%d",
    "dataverseContacts": [{"contactEmail": "pi@example.edu"}]}''' %i

    if i == 1:
        url = f'{dataverse_server}/api/dataverses/root?key={api_key}'
    else:
        url = f'{dataverse_server}/api/dataverses/nested{i//2}?key={api_key}'
    requests.post(url, data=dataverse_json)
    requests.post(f'{dataverse_server}/api/dataverses/nested{i}/actions/:publish?key={api_key}')
    # Recurse into both children; node i's parent is nested{i//2}.
    make_nested(2*i)
    make_nested(2*i+1)

def make_flat():
    for i in range(MAX+1):
        print(i)
        dataverse_json = '''{
        "name": "Scientific Research",
        "alias": "flat%d",
        "dataverseContacts": [{"contactEmail": "pi@example.edu"}]}''' %i

        url = f'{dataverse_server}/api/dataverses/root?key={api_key}'
        requests.post(url, data=dataverse_json)
        requests.post(f'{dataverse_server}/api/dataverses/flat{i}/actions/:publish?key={api_key}')

dataverse_server = 'http://localhost:8080' # no trailing slash
api_key = 'aba1b874-b86e-44fd-a992-ced767ece854' #root
api_key = 'dba4246f-4aec-4579-9c22-8f09b438264b' #user
MAX = 5000
#make_nested()
#make_flat()

r = requests.get(f'{dataverse_server}/api/dataverses/:root/contents?key={api_key}')
print(r.json())

make_nested creates a binary tree of nested dataverses; make_flat makes a bunch of dataverses at the same level.

oscardssmith commented 6 years ago

Just confirming, this issue is reproducible on harvard dataverse. I think that to move forward on this, I need some place that has enough data, that I can profile and make code changes to.

pdurbin commented 6 years ago

@oscardssmith ok, @jggautier and @dlmurphy have access to a copy of production data for #4169 and can probably help you get a copy. That copy is updated every once in a while by @kcondon but I'm sure that the copy will be new enough to exercise the bug.

michbarsinai commented 6 years ago

The performance issues might have more to do with groups and containment hierarchy depth than with Dataverse count.

oscardssmith commented 6 years ago

I'm most of the way through installing the harvard dataverse (minus files) to my system, which should let me test this finally.

pdurbin commented 6 years ago

@oscardssmith the other day I was telling you about a nice visualization in D3 that @raprasad made that shows a hierarchy of dataverses. He and I just looked and couldn't find it but we did find the JSON output that the visualization was created from at "Full Dataverse hierarchy (pretty)" linked from https://services.dataverse.harvard.edu/miniverse/metrics/metrics-links

Here's a few lines of how https://services.dataverse.harvard.edu/miniverse/metrics/dv-tree-full.json?pretty looks:

[Screenshot, 2018-07-11: first few lines of dv-tree-full.json]

Here's the code if it's of interest: https://github.com/IQSS/miniverse/blob/864cb9d75c45933cb74f67aa1eada5f2e13c0c13/dv_apps/metrics/urls.py#L79

oscardssmith commented 6 years ago

At this point, I've added fastpaths for superusers and non logged in users that both only take ~2 seconds to complete. Now to get Authenticated Users equally fast.

oscardssmith commented 6 years ago

I have an implementation with Solr working, which I am now going to ditch because it takes about as long as my previous attempt (25 min) just for solr to do its part. At this point we have 3 options to go forward:

  1. Write lots of custom queries to brute force this
  2. Cache group membership
  3. Only allow users to look up some number (~50) at a time.
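
As an illustration of option 2, here is a minimal sketch of memoizing group resolution so that the same membership lookup is not repeated for every object being checked. The `GROUP_MEMBERS` data and the `groups_for` helper are hypothetical; the sketch assumes group nesting has no cycles, and that `groups_for.cache_clear()` is called whenever membership changes.

```python
from functools import lru_cache

# Hypothetical membership data: group alias -> direct members
# (users start with "@", nested groups with "&").
GROUP_MEMBERS = {
    "&staff": {"@alice", "&curators"},
    "&curators": {"@bob"},
}


@lru_cache(maxsize=None)
def groups_for(assignee: str) -> frozenset:
    """Return every group containing `assignee`, directly or via nested groups."""
    found = set()
    for group, members in GROUP_MEMBERS.items():
        if assignee in members:
            found.add(group)
            found |= groups_for(group)   # groups that contain this group
    return frozenset(found)
```

The first call for a user walks the group hierarchy; later permission checks for the same user reuse the cached result.
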

landreev commented 6 years ago

I've been reviewing this, and I don't see any problems with the code. But, given the complexity of this refactoring, do you think this needs to be tested in some comprehensive way, before we decide that it's done? I'm ok if this testing part is done during QA. But I feel it may still be up to the developers, to come up with such comprehensive tests. Again, considering the complexity.

Since this started as an issue with the specific dataverse-listing API, maybe a good test for this is to script running the API on the entire production database, recursively; and compare the "before" and "after", and confirm that the results are identical? (This of course could take a while, seeing how it takes forever in the "before" state).

Any better ideas?
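
One possible shape for the before/after comparison suggested above. The `fetch` callable stands in for a call to the contents endpoint, and the assumption that each returned item carries `type` and `id` fields is mine, not something confirmed in this thread.

```python
from typing import Callable, Dict, List

# Hypothetical type: given a dataverse id or alias, return the decoded list
# of items from GET /api/dataverses/{id}/contents.
ContentsFn = Callable[[str], List[dict]]


def snapshot(fetch: ContentsFn, root: str) -> Dict[str, List[str]]:
    """Walk the tree from `root` and record each dataverse's sorted child listing."""
    result = {}
    pending = [root]
    while pending:
        current = pending.pop()
        children = fetch(current)
        result[current] = sorted(f"{c['type']}:{c['id']}" for c in children)
        # Descend only into child dataverses; datasets are leaves here.
        pending.extend(str(c["id"]) for c in children if c["type"] == "dataverse")
    return result


def diff(before: Dict[str, List[str]], after: Dict[str, List[str]]) -> List[str]:
    """Return the dataverse ids whose listings differ between two snapshots."""
    keys = before.keys() | after.keys()
    return sorted(k for k in keys if before.get(k) != after.get(k))
```

Running `snapshot` once against the old build and once against the new one, each with the same API token, and checking that `diff` returns an empty list would confirm the listings are identical.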

kcondon commented 6 years ago

@michbarsinai we cannot deploy this, any ideas?

[2018-08-31T17:17:45.982-0400] [glassfish 4.1] [SEVERE] [NCLS-CORE-00026] [javax.enterprise.system.core] [tid: _ThreadID=47 _ThreadName=admin-listener(4)] [timeMillis: 1535750265982] [levelValue: 1000] [[
  Exception during lifecycle processing
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.hibernate.validator.internal.cdi.interceptor.ValidationInterceptor
    at com.sun.ejb.containers.BaseContainer.setStartedState(BaseContainer.java:962)
    at org.glassfish.ejb.startup.EjbApplication.markAllContainersAsStarted(EjbApplication.java:140)
    at org.glassfish.ejb.startup.EjbApplication.start(EjbApplication.java:152)
    at org.glassfish.internal.data.EngineRef.start(EngineRef.java:122)
    at org.glassfish.internal.data.ModuleInfo.start(ModuleInfo.java:291)
    at org.glassfish.internal.data.ApplicationInfo.start(ApplicationInfo.java:352)
    at com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycle.java:500)
    at com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycle.java:219)
    at org.glassfish.deployment.admin.DeployCommand.execute(DeployCommand.java:491)
    at com.sun.enterprise.v3.admin.CommandRunnerImpl$2$1.run(CommandRunnerImpl.java:539)
    at com.sun.enterprise.v3.admin.CommandRunnerImpl$2$1.run(CommandRunnerImpl.java:535)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:360)
    at com.sun.enterprise.v3.admin.CommandRunnerImpl$2.execute(CommandRunnerImpl.java:534)
    at com.sun.enterprise.v3.admin.CommandRunnerImpl$3.run(CommandRunnerImpl.java:565)
    at com.sun.enterprise.v3.admin.CommandRunnerImpl$3.run(CommandRunnerImpl.java:557)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:360)
    at com.sun.enterprise.v3.admin.CommandRunnerImpl.doCommand(CommandRunnerImpl.java:556)
    at com.sun.enterprise.v3.admin.CommandRunnerImpl.doCommand(CommandRunnerImpl.java:1464)
    at com.sun.enterprise.v3.admin.CommandRunnerImpl.access$1300(CommandRunnerImpl.java:109)
    at com.sun.enterprise.v3.admin.CommandRunnerImpl$ExecutionContext.execute(CommandRunnerImpl.java:1846)
    at com.sun.enterprise.v3.admin.CommandRunnerImpl$ExecutionContext.execute(CommandRunnerImpl.java:1722)
    at org.glassfish.admin.rest.resources.admin.CommandResource.executeCommand(CommandResource.java:404)
    at org.glassfish.admin.rest.resources.admin.CommandResource.execCommandSimpInMultOut(CommandResource.java:234)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    : . . .
Caused by: java.lang.ClassNotFoundException: org.hibernate.validator.internal.cdi.interceptor.ValidationInterceptor
    at org.glassfish.web.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1783)
    at org.glassfish.web.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1633)
    at com.sun.ejb.containers.interceptors.InterceptorManager.buildEjbInterceptorChain(InterceptorManager.java:431)
    at com.sun.ejb.containers.interceptors.InterceptorManager.<init>(InterceptorManager.java:131)
    at com.sun.ejb.containers.BaseContainer.initializeInterceptorManager(BaseContainer.java:3396)
    at com.sun.ejb.containers.BaseContainer.setStartedState(BaseContainer.java:950)
    ... 67 more
]]

michbarsinai commented 6 years ago

Interesting.... I don't get that one (on a newly installed laptop, using the installer script). I do get some other errors, though. Will report when it's solved (moving back to community dev now 😕)

michbarsinai commented 6 years ago

Update - I've moved #4944 and its PR to QA. Since this issue is included in that one, you might want to QA only PR #4998. In any event, I didn't get the missing class exception on either issue.

landreev commented 6 years ago

Hmm. I haven't been able to reproduce it on my dev. system either. My glassfish setup is identical (4.1 build 13). The jar where the validator class lives (glassfish/modules/bean-validator-cdi.jar) is the same. dataverse-internal (vm4) may have a somewhat newer version of jdk 1.8... it could be conceivable that something in that newer jdk conflicts with the (relatively old) jars shipped with GF 4.1... but then it's only bombing with this branch, and not with others...

@kcondon has just suggested that it could be something with how the war file is built on build.hmdc specifically - I'm going to take that and try to run it on my system now.

michbarsinai commented 6 years ago

Maybe a messed up maven cache on that machine?

landreev commented 6 years ago

No, that .war file also deploys just fine on my own Mac OS system. I'm kind of out of ideas for now...

landreev commented 6 years ago

FWIW, we were reading this thread with somebody reporting this same "class not found" exception: https://stackoverflow.com/questions/28719090/java-lang-classnotfoundexception-org-hibernate-validator-internal-cdi-intercept. One thing the victim mentions there is: "...It was a @javax.validation.constraints.NotNull annotation on a method parameter in a @Stateless bean..." and I happened to notice that @michbarsinai did add a couple of NotNull checks to a stateless bean in that branch (DataverseRoleServiceBean). And this was something we hadn't done before (as in, we only used @NotNull in entities before). So I tried and commented out these NotNulls from the bean, built a war file - and it is now deploying on dvn-vm4. I still have zero ideas as to why this was only a problem on vm4 (and dvn-build - so, presumably, on all of our older RH 6 boxes). Please advise.

michbarsinai commented 6 years ago

I've added them as part of the code review comments. I'm happy to keep the version with the @NotNull comments deleted.

In the long run, we might need to update the validation and JPA implementation anyway, since AFAIK they don't support lambdas etc. either at this point. Out of scope for this issue, though.

landreev commented 6 years ago

I can check in my version with these annotations commented out. But, would you need to add some alternative logic to check for nulls there?

michbarsinai commented 6 years ago

No. I left them in and had to explain why (reason: they only create warnings, not compile errors). It's just added documentation, that's all.
