Closed echoxxzz closed 1 year ago
Thanks for this description. Unfortunately, I get a lot of "Skipping... table not found" issues while importing the data. I'm not sure, what I'm doing wrong.
Are you sure got the latest SQL scripts from the Musicbrainz GIT repo and copied them to the appropriate folder replacing the ones that come with mbdata?
I did it. But maybe I copied it to the wrong folder. I will check again.
To be honest you don't really need to use the mbdata command to run the sql scripts. Just use psql and specify the database name as the last paramenter or as a input to -d.
e.g If your database is called Musicbrianz: psql -f CreatePrimaryKeys.sql Musicbrainz
Try that using the latest sql scripts then the import should work.
@echoxxzz I think, I was using a too old postgresql version (9.5). I upgraded and not it seems to work,
According to the Install.MD file on Github it requires 12 or later. I updated my instructions. Glad everything worked out for you.
When I run "mbslave psql -f CreateFunctions.sql" I get the following error:
psql:/var/folders/gd/ymqmmv7n1pnbrl303865ynbh0000gn/T/tmp3d4outcm.sql:23: ERROR: function array_append(anyarray, anyelement) does not exist
Any way around this? I did some digging and it seems like an issue with Postgresql 14+
deleted (no longer relevant)
@jeleffto The MB SQL scripts were updated recently to make them compatible with PG 14. See https://github.com/metabrainz/musicbrainz-server/commit/38861c768d427970e9088cefc76604b05ee11bab. If you try installing mbdata from the latest master or update the scripts later manually, the error should go away.
It seems like mbdata isn't being maintained anymore. Everything it does can be done using the Musicbrainz repo and it works on all supported versions of Postgresql.
Most of this info can be found in the Musicrainz Install.md file but I will go through the steps I used to setup a Musicbrainz mirrored database server with replication on Ubuntu Server 22.04.1 LTS.
Since I'm only using this server for Musicbrainz I didn't use local::lib for Perl modules and I trusted all local logins to Postgres as it just makes the Install and Mirror script easier to run.
Install Postgresql:
sudo apt install postgresql-14 postgresql-server-dev-14 postgresql-contrib
Edit the /etc/postgresql/14/main/pg_hba.conf and change this line from:
local all postgres peer
to:
local all all trust
Restart the Postgresql server:
sudo systemctl restart postgresql
Verify you can login and list the system databases:
psql -U postgres -l
Install the build tools and dependencies:
sudo apt install build-essential cpanminus libdb-dev libexpat1-dev zlib1g-dev moreutils pkg-config
Clone the Musicbrainz repository:
git clone https://github.com/metabrainz/musicbrainz-server.git
cd musicbrainz-server
Install the Perl dependencies:
sudo cpanm --installdeps --notest .
You should end up with 307 distributions installed.
Copy and edit the sample DBDefs.pm:
cp lib/DBDefs.pm.sample lib/DBDefs.pm
nano lib/DBDefs.pm
Download the latest Musicbrainz database dump files. Minimal you need mbdump.tar.bz2 & mbdump-derived.tar.bz2.
Create the database and import the dump files:
./admin/InitDb.pl --createdb --echo --import mbdump*.tar.bz2
Create a cron job to update the database hourly. Musicbrainz ships with a shell script called called mirror.sh located in the admin/cron folder. To run the script ever hour enter the following:
0 /home/username/musicbrainz-server/admin/cron/mirror.sh >/dev/null 2>&1
into
crontab -e
Enjoy.
Make sure you're using Postgre 12 or later.
Complete the first 3 steps to install mbdata and setup your mbslave.conf file.
The SQL scripts in the mbdata repo haven't been updated in over 6 months so you need to either clone the Musicbrainz git repo or download the latest Musicbrainz zip file from Github and copy and/or exact the contents of the musicbrainz-server-master/admin/sql folder and replace the files in the mbdata/sql folder.
If you used pipx the mbdata/sql folder will be located somewhere like the following:
~/.local/pipx/venvs/mbdata/lib/python3.x/site-packages/mbdata/sql
If needed, use find to locate your specific folder location.
The remainder of the steps where taken verbatim from the Musicbrainz InitDb.pl script. This will create a complete Musicbrainz slave database will all tables, indexes, etc. You may want to customize as needed if say for instance you're not importing wikidocs or other non-essential data dumps.
Create the Database:
sudo su - postgres createuser musicbrainz createdb -l C -E UNICODE -T template0 -O musicbrainz musicbrainz
MB now use DateTime->now to populate 'TIMESTAMP WITH TIME ZONE' columns in their code.
DateTime->now outputs 'floating' UTC by default, but doesn't encode any timezone info in its output, so the
database must have its timezone set to UTC in order to correctly interpret those values.
psql musicbrainz -c 'ALTER DATABASE musicbrainz SET timezone TO 'UTC';'
psql musicbrainz -c 'CREATE EXTENSION IF NOT EXISTS cube WITH SCHEMA public;' psql musicbrainz -c 'CREATE EXTENSION IF NOT EXISTS earthdistance WITH SCHEMA public;' psql musicbrainz -c 'CREATE EXTENSION IF NOT EXISTS unaccent WITH SCHEMA public;'
exit out of sudo back to the account you setup for mbdata
Prepare empty schemas:
echo 'CREATE SCHEMA musicbrainz;' | mbslave psql -S echo 'CREATE SCHEMA statistics;' | mbslave psql -S echo 'CREATE SCHEMA cover_art_archive;' | mbslave psql -S echo 'CREATE SCHEMA wikidocs;' | mbslave psql -S echo 'CREATE SCHEMA documentation;' | mbslave psql -S echo 'CREATE SCHEMA event_art_archive;' | mbslave psql -S echo 'CREATE SCHEMA json_dump;' | mbslave psql -S echo 'CREATE SCHEMA report;' | mbslave psql -S echo 'CREATE SCHEMA sitemaps;' | mbslave psql -S
Create tables structures:
The first script will give an error that the extensions already exist because we already added them in step 1. The
extensions have to be created in step 1 as you need to be a superuser account to create extensions. We just need
to run the Extensions.sql script to add the musicbrainz.ll_to_earth() function so index creation won't fail as reported
in this issue ticket.
mbslave psql -f Extensions.sql
mbslave psql -f CreateCollations.sql mbslave psql -f CreateTables.sql mbslave psql -f caa/CreateTables.sql mbslave psql -f eaa/CreateTables.sql mbslave psql -f documentation/CreateTables.sql mbslave psql -f json_dump/CreateTables.sql mbslave psql -f report/CreateTables.sql mbslave psql -f sitemaps/CreateTables.sql mbslave psql -f statistics/CreateTables.sql mbslave psql -f wikidocs/CreateTables.sql
Import the data dumps. Minimally you need the following two dump files:
mbslave import mbdump.tar.bz2 mbdump-derived.tar.bz2
Create the primary keys:
mbslave psql -f CreatePrimaryKeys.sql mbslave psql -f caa/CreatePrimaryKeys.sql mbslave psql -f documentation/CreatePrimaryKeys.sql mbslave psql -f eaa/CreatePrimaryKeys.sql mbslave psql -f statistics/CreatePrimaryKeys.sql mbslave psql -f wikidocs/CreatePrimaryKeys.sql
Create functions
mbslave psql -f CreateSearchConfiguration.sql mbslave psql -f CreateFunctions.sql mbslave psql -f caa/CreateFunctions.sql mbslave psql -f eaa/CreateFunctions.sql mbslave psql -f CreateSlaveOnlyFunctions.sql
Create indexes
mbslave psql -f CreateIndexes.sql mbslave psql -f caa/CreateIndexes.sql mbslave psql -f eaa/CreateIndexes.sql mbslave psql -f json_dump/CreateIndexes.sql mbslave psql -f sitemaps/CreateIndexes.sql mbslave psql -f statistics/CreateIndexes.sql mbslave psql -f CreateSlaveIndexes.sql
Set initial sequence values
mbslave psql -f SetSequences.sql mbslave psql -f statistics/SetSequences.sql
Create views and triggers and search indexes
mbslave psql -f CreateViews.sql mbslave psql -f caa/CreateViews.sql mbslave psql -f eaa/CreateViews.sql mbslave psql -f CreateSlaveOnlyTriggers.sql mbslave psql -f CreateSearchIndexes.sql