IQSS / dataverse

Open source research data repository software
http://dataverse.org

dataverse 6.3 fail to create table #10765

Open link89 opened 3 months ago

link89 commented 3 months ago

What steps does it take to reproduce the issue? Run python install.py in a fresh environment.

Command start-domain executed successfully.
/tmp/dvinstall
Payara setup complete
Fri Aug  9 14:34:57 UTC 2024

Installing additional configuration files (Jhove)...
done.
Deploying the application (dataverse.war)

PER01003: Deployment encountered SQL Exceptions:
        PER01000: Got SQLException executing statement "ALTER TABLE fileaccessrequests ADD CONSTRAINT FK_fileaccessrequests_DATAFILE_ID FOREIGN KEY (DATAFILE_ID) REFERENCES DVOBJECT (ID)": org.postgresql.util.PSQLException: ERROR: constraint "fk_fileaccessrequests_datafile_id" for relation "fileaccessrequests" already exists

Which version of Dataverse are you using?

    "version": "6.3",
    "build": "1607-8c99a74"

link89 commented 3 months ago

After many failed attempts, I found that this problem can be worked around by running ./setup-all.sh again. I suspect that after dataverse.war is deployed, the application needs some time to initialize (creating tables, etc.). If ./setup-all.sh is executed before everything is ready, the 404 errors occur.

I think a better solution is to find a way to check whether Dataverse is ready before running the ./setup-all.sh script. A quick way to reduce the chance of this error is to add an extra delay before ./setup-all.sh.
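A readiness check of the kind suggested here could look roughly like the sketch below. It is only an illustration, not part of install.py: the function names http_probe and wait_until_ready are hypothetical, and the timeout and interval values are arbitrary defaults.

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=5.0):
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a 404 while the app is still deploying.
        return False


def wait_until_ready(probe, max_wait=300.0, interval=5.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call `probe()` until it returns True or `max_wait` seconds have passed."""
    deadline = clock() + max_wait
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

For example, wait_until_ready(lambda: http_probe("http://localhost:8080/api/info/version")) could be called between the war deployment and ./setup-all.sh. The URL is an assumption about a default local install and would need adjusting for other hosts or ports. Polling with an upper bound avoids both a fixed delay that is too short and an endless wait when the deployment itself has failed.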

mlage commented 1 week ago

Hello, everyone.

I also get this error when I try to perform a fresh install of Dataverse version 6.4 (build 1609-906f874) on an Ubuntu Server 24.04 VirtualBox machine.

If I keep trying to run the install.py script with the --force option, the installation eventually succeeds.

Does someone have any tips on how to avoid this issue?

Thanks!

qqmyers commented 1 week ago

FWIW: Errors of the type shown above are not fatal (I see them in the log every time we deploy an update). There's some work being done to avoid them but they essentially show that Dataverse is trying to make database changes that have already been done (so when it fails, the database is still in the correct state). If your overall deployment is failing, something else is going on.
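To separate the non-fatal "already exists" messages from anything else in the deployment output, a small filter along these lines might help. This is a rough sketch: the function name split_deploy_errors is hypothetical, and the patterns are based only on the messages quoted in this thread, so other message formats may not be recognized.

```python
import re

# Summary line that precedes the individual SQL exceptions in the output above.
HEADER = "PER01003: Deployment encountered SQL Exceptions:"

# The PostgreSQL "already exists" error, as quoted in this thread.
ALREADY_EXISTS = re.compile(r"PSQLException: ERROR: .* already exists")


def split_deploy_errors(output):
    """Split error lines of deployment output into (benign, other)."""
    benign, other = [], []
    for line in output.splitlines():
        text = line.strip()
        if text.startswith(HEADER):
            # Drop the summary prefix; the rest may hold the actual exception.
            text = text[len(HEADER):].strip()
        if not text or not re.search(r"Exception|Error", text):
            continue
        if ALREADY_EXISTS.search(text):
            benign.append(text)
        else:
            other.append(text)
    return benign, other
```

If the second list is empty, the output contains only the kind of error described above as non-fatal. Anything left in it is a better starting point for finding out why the overall deployment failed.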

mlage commented 1 week ago

Thanks for your reply!

The only error displayed in the console is:

PER01003: Deployment encountered SQL Exceptions: PER01000: Got SQLException executing statement "ALTER TABLE fileaccessrequests ADD CONSTRAINT FK_fileaccessrequests_DATAFILE_ID FOREIGN KEY (DATAFILE_ID) REFERENCES DVOBJECT (ID)": org.postgresql.util.PSQLException: ERROR: constraint "fk_fileaccessrequests_datafile_id" for relation "fileaccessrequests" already exists

After that message is shown, the installer exits (returnCode != 0). As I said, if I keep trying to run the script with the --force option, the error eventually does not appear and the installation process succeeds.

Since the behaviour is random, could it be related to earlier steps of the install.py script that had not completed before the war deployment, or to some similar problem?
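The manual workaround described here (re-running the installer with --force until it exits cleanly) could be automated with a bounded retry loop. The sketch below is only an illustration of that workaround, not a fix for the underlying problem. The helper name run_with_retries, the attempt limit, and the delay are all assumptions.

```python
import subprocess
import time


def run_with_retries(command, max_attempts=5, delay=10.0,
                     run=subprocess.run, sleep=time.sleep):
    """Run `command` until it exits with 0.

    Returns the number of the successful attempt, or None if every
    attempt returned a non-zero exit code.
    """
    for attempt in range(1, max_attempts + 1):
        result = run(command)
        if result.returncode == 0:
            return attempt
        if attempt < max_attempts:
            # Give the application server some time before trying again.
            sleep(delay)
    return None
```

A call such as run_with_retries(["python", "install.py", "--force"]) would mirror the steps reported above. The limit on attempts keeps a persistent failure from looping forever.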

qqmyers commented 1 week ago

Possibly - note that #10766 was recently proposed and may help. (I think it was rejected in favor of a TBD solution polling for Dataverse to be ready rather than using a fixed wait time.)

mlage commented 1 week ago

Thanks again for your support.

I have already seen this proposal and tried it. But I don't think it makes sense since the deployment fails to finish, and the sleep command will never run if it is positioned between the war file deploy and the setup-all script.

I was trying to install again and noticed that another message shows up just above the other error message:

Error occurred during deployment: Exception while loading the app : java.lang.IllegalStateException: ContainerBase.addChild: start: org.apache.catalina.LifecycleException: org.apache.catalina.LifecycleException: java.lang.IllegalStateException: OmniFaces failed to initialize! Report an issue to OmniFaces.

Checking the payara log file, I found the following error:

...
at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:515)
at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: java.lang.NullPointerException: Cannot invoke "String.equals(Object)" because the return value of "jakarta.servlet.Servlet>
at org.omnifaces.util.Platform.getFacesServletRegistration(Platform.java:57)
...

I'm attaching here the log file.

server.log

mlage commented 1 week ago

It seems that the problem is related to the thread allocation performed by Java when it runs in the VirtualBox environment, so I'm not sure if the bug is related to the Dataverse code.

I just completed the installation several times using virtual machines with just 1 core.

Thanks @qqmyers for your help!

pdurbin commented 1 week ago

Very strange that it works on one core but not multiple. 🤔 Thanks for the heads up, @mlage.

mlage commented 1 week ago

@pdurbin Thanks for your reply.

Unfortunately, you are right. The error is less frequent, but it is still there.

To give you some context:

Here in Brazil, we have a project at RNP that proposes using Dataverse as the standard solution to build open data institutional repositories in public research labs, universities, etc. In the context of this project, we are now offering a course to IT professionals, researchers, and enthusiasts from several institutions on how to set up and configure a Dataverse server.

In the course, we adopted VirtualBox as the environment for building Dataverse servers. Yesterday, during the installation class, I got the error again, even though I was using only one thread. Some students also reported the issue (although most of them could finish the installation without any problem).

As I reported before, the error is related to thread allocation, but I am still looking for a definitive solution. I would be very grateful if you have any suggestions on how to fix or avoid this error.

pdurbin commented 1 week ago

@mlage hmm, is Docker an option? Can you give this "demo or evaluation" tutorial a try? https://guides.dataverse.org/en/6.4/container/running/demo.html

Feedback is very welcome in #containers at https://chat.dataverse.org !

mlage commented 1 week ago

@prsridha We studied this possibility, but we decided to adopt VirtualBox since the documentation says that running Dataverse in a Docker environment was not recommended for production. Also, with VirtualBox, we could better simulate the real-world scenario of connecting via SSH to the server and performing the required actions to install and configure Dataverse.

Maybe we made a wrong decision, but unfortunately, now that the course is already running, it is too late to change it.

Thanks again for your support!

Best!

link89 commented 1 week ago

since the documentation says that running Dataverse in a Docker environment was not recommended for production.

@mlage I built my own Docker solution for my organization, which strictly follows the official production deployment. The problem I mentioned in this ticket is also fixed in the Docker setup by executing setup-all.sh again after the Python script. You may check whether this solution works for you.

https://github.com/link89/dataverse-docker

mlage commented 1 week ago

Hey @link89!

Initially, I thought we had the same issue because I also got the "PER01003: Deployment encountered SQL Exceptions" message.

However, after a closer look into the log files, I observed that, eventually, the install script raises an error during the war file deployment, and the script exits without deploying Dataverse.

For this reason, I can tell you that your solution doesn't work for me, unfortunately.

Thanks for your feedback!

pdurbin commented 1 week ago

@mlage are you on Windows? Is WSL an option? @lubitchv added some docs recently in #10608. You can find them at https://guides.dataverse.org/en/6.4/developers/windows.html . She's using Ubuntu within WSL.

pdurbin commented 1 week ago

Oh, you said you're running on Ubuntu already. Running VirtualBox on Ubuntu. Have you tried installing Dataverse directly on Ubuntu, maybe with https://github.com/gdcc/dataverse-ansible ? That is, not using VirtualBox, if that's the problem?

mlage commented 1 week ago

@pdurbin In fact I'm running on Windows. The VirtualBox VM is running Ubuntu server 24.04.

I will check the links you sent.

Thanks!

pdurbin commented 1 week ago

@mlage great. It's a dev environment. Send us a pull request! 😜

Baroti commented 1 week ago

We get the same error when we run the Python-based installer (or when we try to deploy the dataverse-6.4.war file) on a new Ubuntu 22.04.4 machine deployed in a KVM virtualization environment (we are a chef opscode shop, not ansible, sorry).

PER01003: Deployment encountered SQL Exceptions: PER01000: Got SQLException executing statement "ALTER TABLE fileaccessrequests ADD CONSTRAINT FK_fileaccessrequests_DATAFILE_ID FOREIGN KEY (DATAFILE_ID) REFERENCES DVOBJECT (ID)": org.postgresql.util.PSQLException: ERROR: constraint "fk_fileaccessrequests_datafile_id" for relation "fileaccessrequests" already exists
Command deploy failed.

We will keep digging.

pdurbin commented 6 days ago

@Baroti can you please provide more of your server.log?

Baroti commented 1 day ago

@pdurbin below are the last SQL statements. It starts with no. 1 and it fails at 404:

2024-11-14 15:28:57.328 EST [17437] dataverse@dataverseLOG: AUDIT: SESSION,1,1,DDL,CREATE TABLE,,,"CREATE TABLE EXTERNALTOOLTYPE (ID SERIAL NOT NULL, TYPE VARCHAR(255) NOT NULL, EXTERNALTOOL_ID BIGINT NOT NULL, PRIMARY KEY (ID))",

...

2024-11-14 15:28:59.001 EST [17437] dataverse@dataverseLOG: AUDIT: SESSION,400,1,DDL,ALTER TABLE,,,ALTER TABLE explicitgroup_explicitgroup ADD CONSTRAINT FK_explicitgroup_explicitgroup_containedexplicitgroups_id FOREIGN KEY (containedexplicitgroups_id) REFERENCES EXPLICITGROUP (ID),

2024-11-14 15:28:59.002 EST [17437] dataverse@dataverseLOG: AUDIT: SESSION,401,1,DDL,ALTER TABLE,,,ALTER TABLE PendingWorkflowInvocation_LOCALDATA ADD CONSTRAINT PndngWrkflwInvocationLOCALDATAPndngWrkflwInvocationINVOCATIONID FOREIGN KEY (PendingWorkflowInvocation_INVOCATIONID ) REFERENCES PENDINGWORKFLOWINVOCATION (INVOCATIONID),

2024-11-14 15:28:59.004 EST [17437] dataverse@dataverseLOG: AUDIT: SESSION,402,1,DDL,ALTER TABLE,,,ALTER TABLE VARGROUP_DATAVARIABLE ADD CONSTRAINT FK_VARGROUP_DATAVARIABLE_varsInGroup_ID FOREIGN KEY (varsInGroup_ID) REFERENCES DATAVARIABLE (ID),

2024-11-14 15:28:59.005 EST [17437] dataverse@dataverseLOG: AUDIT: SESSION,403,1,DDL,ALTER TABLE,,,ALTER TABLE VARGROUP_DATAVARIABLE ADD CONSTRAINT FK_VARGROUP_DATAVARIABLE_VarGroup_ID FOREIGN KEY (VarGroup_ID) REFERENCES VARGROUP (ID),

2024-11-14 15:28:59.011 EST [17437] dataverse@dataverseLOG: AUDIT: SESSION,404,1,DDL,CREATE TABLE,,,"CREATE TABLE SEQUENCE (SEQ_NAME VARCHAR(50) NOT NULL, SEQ_COUNT DECIMAL(38), PRIMARY KEY (SEQ_NAME))",
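For longer logs, the last statement that reached the database can be picked out programmatically. The sketch below assumes that the part after "AUDIT: " is comma-separated, with the statement number in the second field, the command in the fifth, and the statement text in the eighth, which is how the lines above appear to be laid out. The function names are hypothetical.

```python
import csv

MARKER = "AUDIT: "


def parse_audit_line(line):
    """Return (statement_id, command, statement) for an AUDIT line, else None."""
    start = line.find(MARKER)
    if start == -1:
        return None
    # csv handles the quoted statement text, which may itself contain commas.
    fields = next(csv.reader([line[start + len(MARKER):]]), [])
    if len(fields) < 8 or not fields[1].isdigit():
        return None
    return int(fields[1]), fields[4], fields[7]


def last_statement(lines):
    """Return the parsed AUDIT entry with the highest statement number."""
    parsed = [entry for entry in map(parse_audit_line, lines) if entry is not None]
    return max(parsed, key=lambda entry: entry[0]) if parsed else None
```

Passing the lines of the PostgreSQL log to last_statement would show where the sequence of DDL statements stopped, which can then be compared with the timestamp of the failure in Payara's server.log.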

pdurbin commented 1 day ago

@Baroti thanks. Weird. Sorry, nothing is jumping to mind what the problem might be. As far as I know, Dataverse 6.4 can be installed just fine in most environments. 🤔