IQSS / dataverse

Open source research data repository software
http://dataverse.org
Other
878 stars 490 forks source link

Upgrade from Glassfish 4.1 #6230

Closed scolapasta closed 4 years ago

scolapasta commented 5 years ago

Dataverse is currently running on a (modified) Glassfish 4.1. This has served us well, but is old and no longer actively supported. Especially in light of the upcoming DataTags integration, we want to upgrade to a new app server for security uogrades, as well as being able to leverage newer features (e.g. Config API) and enable upgrades of other core technology (e.g. Java EE 7 -> Jakarta EE 8).

There are a few options to explore. Some work has already been done for Payara 5, and this tickets will serve for both for the evaluation of that and some other candidates (TomEE, Glassfish 5, Wildfly), and the actual upgrade.

Note that I'm closing this old issue #2628 with a very similar title, as that is several years old now and was originally about upgrading to 4.1.1.

Let's use this issue going forward for comments related to upgrading from glassfish 4.1.

(per request, also adding a link to the GH project that references subtasks related to this upgrade: https://github.com/orgs/IQSS/projects/16)

poikilotherm commented 5 years ago

@scolapasta may I ask you to link to the GH project? We can rename it to be more technology neutral if it is too Payaraish (too much badass 🐟).

Looking forward to the next community call on this. :smile:

djbrooke commented 5 years ago

Thanks for creating, @scolapasta! I'll put a "Large" on this (any objections? :)) and move it to the team dev column so that we can regularly discuss progress in standup.

pdurbin commented 5 years ago

I'm excited to see the following candidates on the list above:

I think we should also consider one more:

My reasoning for including Open Liberty is twofold.

First, if you go to https://jakarta.ee and follow the links from the announcement of Jakarta EE 8 to the list of compatible products, Open Liberty is a compatible implementation (Payara is not yet listed but I'm sure will be).

Screen Shot 2019-09-30 at 4 35 25 PM Screen Shot 2019-09-30 at 4 35 31 PM Screen Shot 2019-09-30 at 4 35 49 PM

Second, the "fast-deployment maven plugin" mentioned in airhacks 47 and lots of other Open Liberty goodies mentioned in airhacks 32 also sounded awesome: http://airhacks.fm

Finally, as is my wont, I made a spreadsheet so we can start organizing our thoughts on the various candidates: https://docs.google.com/spreadsheets/d/1TVVErG_zFC6k4Fnjz8rJ9VJxMD_xGJSk7OwVjQOLXaQ/edit?usp=sharing

scolapasta commented 5 years ago

Progress on this has been slow to start, mostly because I'm waiting for a new laptop (it didn't make sense to install and work through dev environment issues on my current, old (and very limited HD space), when I'm getting a new one soon.

Pre laptop steps are evaluating the current state of the app servers and make sure they support the things we need (Jakarta EE support (though we won't do that immediately, if we don't have to), Config API, etc).

Once new laptop has arrived, I'll first experiment with Payara (or possibly Glassfish) as those currently seem like they will be the easiest to upgrade to.

poikilotherm commented 4 years ago

Hey @scolapasta any news on this? @pdurbin is already pushing people to try things πŸ˜„, see #5907... :wink: Just kidding, no need to rush this.

A comment on appservers: if we are going to open the discussion on what to use, I'd suggest looking into Payara and Wildfly.

Payara because it's very likely to get going fast due to the relationship between Glassfish and badass 🐟.

Wildfly because it might be a good option to have a go on Quarkus. Someday. Far future. But cool stuff.

scolapasta commented 4 years ago

OK, here's an update on this and a plan for next steps:

In determining which app server to concentrate our efforts on, I started first with Glassfish 5 and Payara 5. For each of these, getting the app to deploy was fairly straightforward - download, unzip, replace the domains directory with the one from glassfish 4, copy over the postgres driver - and voila - BUILD SUCCESSFUL!

While I haven't yet tried the other app servers, the process for those would be more complex. And I do believe the issues we're seeing are not specific to the app server itself, but to the standard library upgrades they are using. So it makes sense to continue down the Payara / Glassfish path while we solve these.

Unfortunately, a successful deployment is not in itself a success. Many things break at that point. I had hoped that their might be one comprehensive solution, and I do still think some of the issues are related, but at this point I think each problem needs to be worked on and solved individually.

So the plan going forward will be to create a branch for this upgrade and run it on an instance so we can keep testing and possibly finding new issues. Fork this branch and start solving the individual problems. (this would be a good area we could use some swarming, if devs are available)

Here are the current known issues (and as we find more, we can add them to this list): (EDIT: newly found issues should be added in new comment below, as we search for more issues after running Payara with JSF 2.3.14)

The following are all artifacts of the same underlying issue (#6463) and are solved by using the latest JSF (2.3.14); marking as solved, though we'll still need to decide if we wait for Payara to ship with JSF 2.3.14 or patch:

Fixes incoming / committed to branch (unchecked vs checked):

There was a branch by @poikilotherm (PR #6220) where he changed the dataverse page from storing the id in the id property of a dataverse object to a dedicated property of the page itself. While that allowed me to click save, the changes I made were not stored. (this still could be how to get this to work, with possibly some more work). Doing a similar change to the dataset page seemed to have no effect.

As we solve each problem, we can see if the solution is helpful for any of the other problems.

scolapasta commented 4 years ago

Some further debugging: for the root dataverse issue, it does look like the solution from #6220 will be the way to go. (the save issue is clearly different, and similar to what we see with users). Fundamentally the issue is that when you load the page using dataverse.alias as the storage place for the view parameter forces bean validation and since this can't be null, the validation fails and the init method is not called (as @poikilotherm observed in #6216). That is why in addition to the base url working, just passing in the id no longer worked either (e.g. dataverse.xhtml?id=1)

In regards to the manage permissions issue, the error is a null pointer. The injected service bean is null. So next step is to understand why - is injection in converters no longer possible? I saw a similar issue in trying to create a dataset, so this figuring this out should solve a class of issues.

scolapasta commented 4 years ago

OK, this explains why it worked before and isn't working now: https://stackoverflow.com/questions/33078849/inject-doesnt-work-in-facesconverter-after-upgrade-glassfish-4-1-to-4-1-1

Working on fix next, hopefully as straightforward as it sounds...

poikilotherm commented 4 years ago

I am still volunteering for help on this issue.

It would be great to discuss as a community about some things. Like what we all see as wanted changes from different perspectives, how we proceed regarding breaking changes and what kind of server we are going to use.

Is it time to reach out to community gathering technical feedback?

scolapasta commented 4 years ago

Thanks @poikilotherm if you want to look at any of these problems, let me know (I'd love it if you or someone could start looking at the upload files issues).

We're still in the investigate how to solve problems stage. (as I pointed out, I don't believe these issues are due to Payara (or any other app server) specifically, but rather to using newer versions of standard libraries (for example, the validation issue you found, or the Converter issue due to changes in later versions of JSF).

Once we make some more progress on understanding / resolving these issues, we can figure out next steps.

scolapasta commented 4 years ago

Update on converters: Bad news / good news. The bad news is that it has not been straightforward to get the solution I prefer working (that is, using either JSF 2.3 syntax or Omnifaces). I'm likely doing something stupid, but trying what has been suggested has yet to work.

Good news is I have gotten one of the other suggestions to work, namely, remove the @Inject and replace by manually grabbing the bean via CDI utility class: RoleAssigneeServiceBean roleAssigneeService = CDI.current().select(RoleAssigneeServiceBean.class).get();

This worked and I can now add role assignments. I am making this change on all the other Converters (there are 10) to see if that helps fix or make progress on some of the other issues.

I may still spend some time trying to get the other solution working, but if not, we have this method.

pdurbin commented 4 years ago

@scolapasta great! The "you were relying on an unspecified/undocumented feature" comment from BalusC in the Stack Overflow post is exactly what I was trying to say in tech hours this week, though I probably framed it more as using a deprecated feature. I have a lot more hope now! πŸŽ‰

scolapasta commented 4 years ago

Some news on upload: I was able to upload a file. However, it didn't work "the first time". i.e. I upload the file, it runs through the upload / progress, disappears, but then isn't on the list. If I upload a 2nd file, it then does add the first file to the list (true for adding nth+1 file, adds nth file to the list).

I've noticed in other places that clicking on some buttons operate differently the first time, and when you click a 2nd time. Not yet sure if this one is related to those.

Other updates: I was able to create a dataset. However, as with dataverse, if there is invalid data it does not work. Also, publish

And lastly for now, File edit. After I uploaded the file, I was able to edit the metadata from the file page, but not if I went from the dataset page. The only differences I see are two extra parameters, mode=SINGLE, version=DRAFT. I added mode=SINGLE to the URL and it worked, so for some reason missing that parameter is an issue.

Still, some progress. Next week, I'll create the branch and start checking in some of this.

scolapasta commented 4 years ago

OH, also, I had to change the dataset page to also not set "id" in the dataset, but to have a dedicated id on the DatasetPage (as we did with DataversePage). Before that I would get a "An instance of a null PK has been incorrectly provided for this find operation." error when clicking edit or when I tried to publish. (The id is getting reset to null from the viewParam; I guess they are not resent and if it's the one from the Dataset object, that is bad).

With the change, the error went away. On edit, I still can't (the page is not updating its view), but I was now able to publish.

djbrooke commented 4 years ago

Thanks @scolapasta for the list in https://github.com/IQSS/dataverse/issues/6230#issuecomment-561330022. As I mentioned in standup, it will be good to get some swarming on this during the sprint starting Wednesday. I'm thinking of running this sprint until 1/8, so we'll have plenty of time.

scolapasta commented 4 years ago

Branch created: 6230-glassfish-upgrade and initial commits (for fixes described above) pushed.

scolapasta commented 4 years ago

So, here's the general plan: for this next sprint we will (among some other issues) be swarming on this. In order to make it easier to manage, when a developer chooses to work on a particular problem, they should break it out into a separate issue (as I've done with #6447 and #6448) and assign themselves. It's ok for multiple people to try to investigate on the same problem, just make sure you're assigned so you can communicate effectively about progress. Once there's a proposed solution, make a PR to this branch (6230-glassfish-upgrade) and I can quickly review (I'd like to see things as they come in, in order to keep a comprehensive view of the types of solutions we are finding).

External developers are very welcome to also help; please just follow the procedure above so we can easily track where things are at.

One note: we prefer solutions to be generic, i.e not tied to any specific app server. In that way, while we will still likely only support one app server, others may experimentally choose to run Dataverse on other app servers. However, we are not as concerned at the time with backward compatibility to Glassfish 4. While it may be nice to have, we will eventually be using features that will definitely not work (i.e. JSF 2.3 features) and require upgrading to one of the modern app servers.

poikilotherm commented 4 years ago

Should we use https://github.com/orgs/IQSS/projects/16 for tracking/visualisation? Maybe the tag, too?

pdurbin commented 4 years ago

Should we use https://github.com/orgs/IQSS/projects/16 for tracking/visualisation?

@poikilotherm as promised, I just mentioned this at standup. My understanding is that you are quite welcome to use that board.

pdurbin commented 4 years ago

I guess I'll mention a quick experiment I tried today based on conversations at standup and what I just read at https://docs.payara.fish/v/5.194/documentation/payara-server/classloading.html

@scolapasta explained that we can figure out which JSF version (Mojarra) we are using with FacesContext.class.getPackage().getImplementationVersion();

In Glassfish 4.1 we are using 2.2.7.

In Payara 5.194 we are using 2.3.9.payara-p3.

I tried to keep us on JSF 2.2 like this...

[glassfish@ip-172-31-45-140 dataverse]$ git diff
diff --git a/pom.xml b/pom.xml
index 481f041..3dbe73c 100644
--- a/pom.xml
+++ b/pom.xml
@@ -599,6 +599,11 @@
             <artifactId>opennlp-tools</artifactId>
             <version>1.9.1</version>
         </dependency>
+        <dependency>
+          <groupId>org.glassfish</groupId>
+          <artifactId>javax.faces</artifactId>
+          <version>2.2.7</version>
+        </dependency>
     </dependencies>
     <build>
         <!--        <testResources>
diff --git a/src/main/webapp/WEB-INF/glassfish-web.xml b/src/main/webapp/WEB-INF/glassfish-web.xml
index ecd3ba1..14bbe56 100644
--- a/src/main/webapp/WEB-INF/glassfish-web.xml
+++ b/src/main/webapp/WEB-INF/glassfish-web.xml
@@ -13,4 +13,6 @@
   <property name="alternatedocroot_logos" value="from=/logos/* dir=./docroot"/>
   <property name="alternatedocroot_sitemap" value="from=/sitemap/* dir=./docroot"/>
   <parameter-encoding default-charset="UTF-8"/>
+  <class-loader delegate="false"/>
+  <property name="useBundledJsf" value="true" />
 </glassfish-web-app>

... but deployment failed with this error:

[#|2019-12-20T18:12:28.858+0000|SEVERE|Payara 5.194|javax.enterprise.system.core|_ThreadID=42;_ThreadName=admin-thread-pool::admin-listener(1);_TimeMillis=1576865548858;_LevelValue=1000;| Exception while loading the app : CDI deployment failure:WELD-001477: The bean Managed Bean [class edu.harvard.iq.dataverse.authorization.providers.oauth2.OAuth2LoginBackingBean] with qualifiers [@Default @Any @Named] declares a passivating scope but has a(n) Interceptor [class org.hibernate.validator.cdi.internal.interceptor.ValidationInterceptor intercepts @MethodValidated] with a non-passivation-capable dependency null -- WELD-001477: The bean Managed Bean [class edu.harvard.iq.dataverse.authorization.providers.oauth2.OAuth2LoginBackingBean] with qualifiers [@Default @Any @Named] declares a passivating scope but has a(n) Interceptor [class org.hibernate.validator.cdi.internal.interceptor.ValidationInterceptor intercepts @MethodValidated] with a non-passivation-capable dependency null|#]

I don't have a lot of confidence that I'm doing this right, by the way.

It's good to know that in the process of upgrading from Glassfish 4.1 to Payara 5 we are also upgrading from JSF 2.2 to 2.3. I looked around for an upgrade guide but couldn't find one. I did find a long-ish "What's new in JSF 2.3?" blog post at http://arjan-tijms.omnifaces.org/p/jsf-23.html linked from https://github.com/eclipse-ee4j/mojarra that might be worth reading.

scolapasta commented 4 years ago

Updates here (I also added to the comment above): patching Payara (and Glassfish 5) with JSF 2.3.14 solved the CDATA related issues. We'll need to decide whether we wait for an app server to ship with this or we patch it temporarily until then. But this unblocks the the functionality that was behind these ajax partial submits so we can test some more.

Remaining known issues:

β€’ metadatablocks selection on (edit) Dataverse page (not an issue) β€’ upload zip file never removes from progress list (not an issue)

scolapasta commented 4 years ago

Zip files upload! It turns out that the file I was trying (a random zip that didn't even have specific data files) was the problem; it also fails on demo. So I went to prod and downloaded several real files as a zip and tried that - worked as expected.

scolapasta commented 4 years ago

The metadatablock selection issues seen also do not seem to be related to the upgrade, as we see the same on demo (running glassfish 4.1.1). Separate issues are being opened for those:

6516

6518

scolapasta commented 4 years ago

6512 turned out to be the same type of issue with the viewParam as #6448 and was solved in the same way, using

scolapasta commented 4 years ago

Update: there are currently no known server version specific issues with the upgrade branch. Please feel free to try out http://payara5.odum.unc.edu/ and find some!

(as all the current fixes should work in Glassfish 4, and we are going to run the current fixes through the process to get them merged into develop)

mheppler commented 4 years ago

In reply to the suggestion from @scolapasta in the comment above that current fixes "should work" in Glassfish 4. I can confirm that I have built the 6230-glassfish-upgrade branch locally on GlassFish Server 4.1 and all the known issues outlined in the checklist above are working as expected.

mheppler commented 4 years ago

Found a reference to an issue with Glassfish in this issue, Permissions: Grant access errors for large # of files #2641. We should look into retesting this in Payara/Glassfish 5.

@sekmiller commented on Mar 22, 2016

https://java.net/jira/browse/GLASSFISH-19470

This is an issue tracking the error message we are seeing. There doesn't appear to be any activity on it. I left a note to say that we are seeing it in Glassfish 4.1.

poikilotherm commented 4 years ago

Payara 5.201 has been released yesterday.

It contains the JSF 2.3.14 update, so no manual patching necessary.

donsizemore commented 4 years ago

@poikilotherm @mheppler I started rebuilding payara5.odum.unc.edu with CentOS 8 and Payara-5.201 yesterday but the installer blew up. Picking that up this morning.

scolapasta commented 4 years ago

OK, new checklist, as we find new issues:

scolapasta commented 4 years ago

The first issue (and possibly 2nd and 3rd) is due to an ArrayIndexOutofBonds in jsf code. We'll likely have to solve thus was a patch. See this mojarra issue: https://github.com/eclipse-ee4j/mojarra/issues/4647 The other alternative would be to remove the EL that evaluates to an empty string from the Javascript, but it feels like that a) may not be possible, b) would have to be a change in how we handle this page.

scolapasta commented 4 years ago

OK! I think we know the cause of the last two now, as well. They're our old friend for which we had other issues earlier in this very issue and for which we created a sub issue: #6448. So I'll make the appropriate changes there and make a patch for jsf (at least for the purpose of testing) and we'll convene testing on Monday.

scolapasta commented 4 years ago

The last two were as suspected and have been fixed (for templates changed to o:viewparam; for guestbook, we were able to remove the viewparam for editMode entirely. Next up, figuring out how to make this patch.

scolapasta commented 4 years ago

The first three issues are fixed with patching jsf (with an experimental jar I've created). I'll ping Payara folks to see what they think and we'll make decisions from there.

scolapasta commented 4 years ago

Keeping open until we decide on the custom jar file that solves the other three items from the recent list.

scolapasta commented 4 years ago

One last viewParam issue on editdatafiles.xhtml to fix.