heavyai / heavydb

HeavyDB (formerly OmniSciDB)
https://heavy.ai
Apache License 2.0

Server Error: System catalog system_catalog does not exist. #783

Closed paulasanematsu closed 1 year ago

paulasanematsu commented 1 year ago

Hi. I work at Harvard FAS Research Computing. We support a group that has used OmniSci on our HPC systems for a few years. We deploy OmniSci using a Singularity container that is built from your Docker containers. We are behind and currently running version 5.5.5. I am trying to update to the latest HeavyAI version, 7.0.0, but I am running into some errors.

I built a Singularity container based on the latest Docker container on HeavyAI DockerHub with:

singularity pull docker://heavyai/heavyai-ee-cuda:latest

I run the Singularity container with the following command:

singularity run --nv -B /etc/nsswitch.conf -B /etc/sssd/ -B /var/lib/sss -B /etc/slurm -B /slurm -B /var/run/munge -B /usr/bin/sbatch -B /usr/bin/srun -B /usr/bin/sacct -B /usr/bin/scontrol -B /usr/lib64/slurm/ -B /scratch/paulasan/55125905/var/lib/heavyai:/var/lib/heavyai --pwd /opt/heavyai /n/singularity_images/OOD/omnisci/heavyai-ee-cuda_latest.sif

The connections seem to be working and the Rebrand migration looks completed, but I suspect that the system_catalog error is causing startheavy to exit. This is the output that I get (I tried to point out the printout from FASRC script):

startheavy 3539282 running
WARN: config file does not exist, ignoring: --config /var/lib/heavyai/heavy.conf
Backend TCP:  localhost:7738
Backend HTTP: localhost:9625
Frontend Web: localhost:8732
Calcite TCP:  localhost:9462
- heavydb 3539291 started
- heavy_web_server 3539292 started
⇨ http server started on [::]:8732
Discovered OmniSci  server listening on port 8732!        # printout from FASRC script
TIMING - Wait ended at: Wed May 24 08:00:33 EDT 2023.     # printout from FASRC script
Rebrand migration: Added symlink from "/var/lib/heavyai/storage/mapd_catalogs" to "catalogs"
Rebrand migration: Added symlink from "/var/lib/heavyai/storage/mapd_data" to "data"
Rebrand migration: Added symlink from "/var/lib/heavyai/storage/mapd_export" to "export"
Rebrand migration: Added symlink from "/var/lib/heavyai/storage/mapd_import" to "import"
Rebrand migration: Added symlink from "/var/lib/heavyai/storage/mapd_log" to "log"
Rebrand migration completed
Server Error: System catalog system_catalog does not exist.
Generating connection YAML file...       # printout from FASRC script
Navigate to: http://localhost:8732
Failed to write to log, write /var/lib/heavyai/storage/log/heavy_web_server.holygpu7c26101.rc.fas.harvard.edu.paulasan.log.ALL.20230524-120032.3539292: file already closed
startheavy 3539282 exited
Cleaning up...

In our older version 5.5.5 the file omnisci_system_catalog is generated, but in version 7.0.0 it is not, which is why I suspect the missing system_catalog is causing the problem.

Could you please provide some assistance on how to solve this issue? Let me know if you need any further information that I may have missed.

Thank you!

cdessanti commented 1 year ago

Hi Paula, the issue may arise because moving from 5.5 to 7.0 requires an intermediate upgrade to 6.0 first, as stated in the docs.

https://docs.heavy.ai/installation-and-configuration/installation/upgrading-omnisci

The steps to migrate to 6.0 using a Docker container are described at this link.

https://docs.heavy.ai/installation-and-configuration/installation/upgrading-omnisci/upgrading-omnisci-1

Then you can migrate to 7.0. Version 7.0 removes render groups, so a backup should be taken before the upgrade.

If you need further assistance, feel free to ask.

I can say that the migration from 5.x to 6.0 renames the system catalog file and creates a symlink from the old name, like this:

lrwxrwxrwx  1 mapd mapd      14 gen 25 18:35 omnisci_system_catalog -> system_catalog
-rw-r--r--  1 mapd mapd   57344 apr  4 19:00 system_catalog

But I'm not sure that doing this is enough for a reliable system.

Have you taken a backup before the upgrade?

regards, Candido

paulasanematsu commented 1 year ago

Hi Candido,

Thank you for getting back to me quickly. I pulled the 6.0.0 container but I am getting the same error.

To clarify, I don't have a database. We provide the Immerse interface and our users load their databases afterwards. This tutorial (https://www.youtube.com/watch?v=4kraLwo3suI&t=2s) shows the interface that I am working on.

I don't have the omnisci_system_catalog stored anywhere. In version 5.5.5, this file is somehow generated when I run the Singularity container. These are the files generated after running the container:

[jharvard@holygpu2c0709 55175673]$ pwd
/scratch/jharvard/55175673
[jharvard@holygpu2c0709 55175673]$ cd omnisci-storage/data/mapd_catalogs/
[jharvard@holygpu2c0709 mapd_catalogs]$ ls -l
total 128
-rw-r--r-- 1 jharvard jharvard_lab 77824 May 24 15:31 omnisci
-rw-r--r-- 1 jharvard jharvard_lab 53248 May 24 15:31 omnisci_system_catalog

However, in version 6.0.0 (and 7.0.0) the system_catalog file is not generated. And since there is no existing omnisci_system_catalog, a symlink cannot be created either?


[root@holygpu7c26101 ~]# cd /scratch/jharvard/55174492/var/lib/heavyai/storage/
[root@holygpu7c26101 storage]# ls -l
total 8
drwxr-xr-x 2 jharvard jharvard_lab    6 May 24 15:17 catalogs
drwxr-xr-x 2 jharvard jharvard_lab    6 May 24 15:17 data
drwxr-xr-x 2 jharvard jharvard_lab    6 May 24 15:17 export
-rw-r--r-- 1 jharvard jharvard_lab    7 May 24 15:17 heavydb_pid.lck
drwxr-xr-x 2 jharvard jharvard_lab    6 May 24 15:17 import
drwxr-xr-x 2 jharvard jharvard_lab    6 May 24 15:17 lockfiles
drwxr-xr-x 2 jharvard jharvard_lab 4096 May 24 15:17 log
lrwxrwxrwx 1 jharvard jharvard_lab    8 May 24 15:17 mapd_catalogs -> catalogs
lrwxrwxrwx 1 jharvard jharvard_lab    4 May 24 15:17 mapd_data -> data
lrwxrwxrwx 1 jharvard jharvard_lab    6 May 24 15:17 mapd_export -> export
lrwxrwxrwx 1 jharvard jharvard_lab    6 May 24 15:17 mapd_import -> import
lrwxrwxrwx 1 jharvard jharvard_lab    3 May 24 15:17 mapd_log -> log
[root@holygpu7c26101 storage]# ls -l catalogs/
total 0

cdessanti commented 1 year ago

Hi Paula,

When you run 6.0 against an OmniSci storage directory, files and directories are renamed, links are generated, and the catalogs and data files are processed as an upgrade step.

So the question is: are you trying to upgrade from 5.x, or do you just want to create an environment with a new version for your users?

paulasanematsu commented 1 year ago

I only want to create an environment with a new version for our users.

When I sent my first inquiry, I was creating the following directories manually:

mkdir -p ${OMNIROOT}/var/lib/heavyai/storage/data
mkdir -p ${OMNIROOT}/var/lib/heavyai/storage/catalogs
mkdir -p ${OMNIROOT}/var/lib/heavyai/storage/import
mkdir -p ${OMNIROOT}/var/lib/heavyai/storage/export
mkdir -p ${OMNIROOT}/var/lib/heavyai/storage/log
mkdir -p ${OMNIROOT}/var/lib/heavyai/storage/lockfiles

where $OMNIROOT is a local temporary storage.

After you asked the question about a new environment, I realized that creating those directories manually could pose a problem. So, I removed those mkdir statements. That solved my problem and it seems to work! Well, at least I can get to the Immerse interface, which is all I need. Now I can ask our users to test it.
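For anyone hitting the same error, here is a minimal sketch of the setup that works, assuming the same `OMNIROOT` variable as above. Only the root that gets bind-mounted to /var/lib/heavyai is created, and the server is left to create `storage` and its subdirectories on first start. The function name `prepare_heavyai_root` is hypothetical and not part of any HeavyAI tooling, and the refusal to reuse an existing `storage` directory only makes sense for a fresh scratch environment like this one, not for a real upgrade.

```shell
# Hypothetical helper: create only the bind-mount root, never storage/* itself.
prepare_heavyai_root() {
    local omniroot="$1"
    local root="${omniroot}/var/lib/heavyai"

    mkdir -p "${root}" || return 1

    # A pre-created storage tree is what triggered the rebrand migration
    # in this thread, so stop instead of silently reusing it.
    if [ -e "${root}/storage" ]; then
        echo "refusing to reuse existing ${root}/storage" >&2
        return 1
    fi

    # Print the path so the caller can use it in the container bind mount.
    echo "${root}"
}
```

The printed path would then go on the left-hand side of the `-B ...:/var/lib/heavyai` bind in the `singularity run` command.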

Thanks a lot for your guidance and time!

cdessanti commented 1 year ago

Yes, you don't need to create the subdirectories manually, just the root /var/lib/heavyai (or whatever you choose), and then specify it in the data parameter of the config file. If it is empty at server startup, all the necessary structures will be created and initialized, but if the directories exist, the server will attempt an upgrade/rebrand.
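The startup behaviour can be approximated with a small check run before launching the container. This is only a sketch based on the file names that appear in this thread (`catalogs/system_catalog` for 6.x/7.x, `mapd_catalogs/omnisci_system_catalog` for 5.x). The name `storage_state` and the four state labels are hypothetical, and the real server logic may look at more than this.

```shell
# Hypothetical pre-flight check; the argument is the storage directory,
# e.g. /var/lib/heavyai/storage as seen in the listings above.
storage_state() {
    local storage="$1"

    if [ ! -e "${storage}" ]; then
        # Nothing there yet: the server should create and initialize everything.
        echo "fresh"
    elif [ -f "${storage}/catalogs/system_catalog" ]; then
        # Catalog file with the new name is present.
        echo "initialized"
    elif [ -f "${storage}/mapd_catalogs/omnisci_system_catalog" ]; then
        # Old-style catalog: an upgrade/rebrand would be expected.
        echo "legacy"
    else
        # Directories exist but no catalog: the failing case in this thread.
        echo "skeleton"
    fi
}
```

A result of `skeleton` corresponds to the situation reported here, where pre-created empty directories led to "System catalog system_catalog does not exist".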

When I saw these messages, I thought that you were trying to upgrade from OmniSci 5.5.

I'm happy that everything is OK now, and I hope you are enjoying our suite.

Regards, Candido

paulasanematsu commented 1 year ago

Thank you. Now that I understand what was happening with the existing directories and the migration, it makes a lot more sense. Thank you for your help. I reached out to our users for testing, but I think we resolved this issue. I will close it.

Again, thank you very much for your guidance.