metabrainz / musicbrainz-docker

Docker Compose project for the MusicBrainz Server with replication, search, and development setup
https://musicbrainz.org/doc/MusicBrainz_Server/Setup

MBVM-88: Fetch DB dumps via HTTP instead of FTP #242

Closed atj closed 1 year ago

atj commented 1 year ago

Users have reported issues fetching DB dumps over FTP, and further investigation revealed that these were caused by the use of IPv4-mapped IPv6 addresses. Unfortunately this cannot be resolved easily because of the idiosyncratic nature of the FTP protocol. This commit changes the default download protocol from FTP to HTTP, which resolves the issue and will allow us to make DB dumps available over IPv6 in future.

Note that variable names containing FTP have not been changed for backward compatibility reasons.
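
As an illustration of the backward-compatibility point, here is a minimal sketch of how an FTP-named variable could keep working while the default switches to HTTP. It assumes the script falls back to a built-in URL when the variable is unset; `DEFAULT_BASE_URL` and `resolve_base_url` are hypothetical names, not taken from the actual scripts, while `MUSICBRAINZ_BASE_FTP_URL` and the HTTP URL appear in the test runs in this thread.

```shell
#!/bin/sh
# Hypothetical sketch, not the project's actual code.
# Default base URL now uses HTTP (value as seen in the test logs in this thread).
DEFAULT_BASE_URL='http://ftp.eu.metabrainz.org/pub/musicbrainz'

# Print the base URL to download from.
# The legacy FTP-named variable still takes precedence when it is set,
# so existing configurations keep working unchanged.
resolve_base_url() {
  printf '%s\n' "${MUSICBRAINZ_BASE_FTP_URL:-$DEFAULT_BASE_URL}"
}

resolve_base_url
```

With the variable unset, the function prints the HTTP default; exporting `MUSICBRAINZ_BASE_FTP_URL` with an `ftp://` value would still select FTP.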

atj commented 1 year ago

This is the only test I've run:

$ docker-compose run --rm musicbrainz createdb.sh -sample -fetch
[+] Running 6/6
 ✔ Network musicbrainz-docker_default        Created                                                                                                                                                                                     0.1s 
 ✔ Container musicbrainz-docker-validator-1  Created                                                                                                                                                                                     0.2s 
 ✔ Container musicbrainz-docker-db-1         Created                                                                                                                                                                                     0.2s 
 ✔ Container musicbrainz-docker-redis-1      Created                                                                                                                                                                                     0.2s 
 ✔ Container musicbrainz-docker-mq-1         Created                                                                                                                                                                                     0.2s 
 ✔ Container musicbrainz-docker-search-1     Created                                                                                                                                                                                     0.2s 
[+] Running 5/5
 ✔ Container musicbrainz-docker-mq-1         Started                                                                                                                                                                                     0.9s 
 ✔ Container musicbrainz-docker-db-1         Started                                                                                                                                                                                     0.6s 
 ✔ Container musicbrainz-docker-validator-1  Started                                                                                                                                                                                     0.6s 
 ✔ Container musicbrainz-docker-redis-1      Started                                                                                                                                                                                     0.9s 
 ✔ Container musicbrainz-docker-search-1     Started                                                                                                                                                                                     0.9s 
Mon 17 Apr 2023 12:16:54 PM UTC: Fetching database dump...
--2023-04-17 12:16:54--  http://ftp.eu.metabrainz.org/pub/musicbrainz/data/sample/LATEST
Resolving ftp.eu.metabrainz.org (ftp.eu.metabrainz.org)... 138.201.203.43
Connecting to ftp.eu.metabrainz.org (ftp.eu.metabrainz.org)|138.201.203.43|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16 [application/octet-stream]
Saving to: ‘/media/dbdump/LATEST’

LATEST                                                      100%[=========================================================================================================================================>]      16  --.-KB/s    in 0.008s  

2023-04-17 12:16:57 (1.85 KB/s) - ‘/media/dbdump/LATEST’ saved [16/16]

--2023-04-17 12:16:57--  http://ftp.eu.metabrainz.org/pub/musicbrainz/data/sample/20230401-000001/mbdump-sample.tar.xz
Resolving ftp.eu.metabrainz.org (ftp.eu.metabrainz.org)... 138.201.203.43
Connecting to ftp.eu.metabrainz.org (ftp.eu.metabrainz.org)|138.201.203.43|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 286747812 (273M) [application/octet-stream]
Saving to: ‘/media/dbdump/mbdump-sample.tar.xz’

mbdump-sample.tar.xz                                        100%[=========================================================================================================================================>] 273.46M  36.5MB/s    in 6.4s    

2023-04-17 12:17:04 (42.9 MB/s) - ‘/media/dbdump/mbdump-sample.tar.xz’ saved [286747812/286747812]

Mon 17 Apr 2023 12:17:04 PM UTC: Done fetching dump files.
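
The log above shows a two-step fetch: first the small `LATEST` file, then the dump archive from a dated directory. A rough sketch of how the second URL could be derived from the first download follows. It assumes `LATEST` holds the directory name on its first line (consistent with its 16-byte length and the `20230401-000001` path in the log); `dump_url` is a hypothetical helper, not the real `fetch-dump.sh` logic.

```shell
#!/bin/sh
# Hypothetical sketch, not the project's actual code.
# Build the sample dump URL from a base URL and a downloaded LATEST file.
#   $1: base URL, e.g. http://ftp.eu.metabrainz.org/pub/musicbrainz
#   $2: path to the LATEST file (assumed to contain the dated directory name)
dump_url() {
  base_url=$1
  latest_file=$2
  # Read the first line; also accept a file without a trailing newline.
  IFS= read -r dump_dir < "$latest_file" || [ -n "$dump_dir" ] || return 1
  printf '%s/data/sample/%s/mbdump-sample.tar.xz\n' "$base_url" "$dump_dir"
}
```

Because only the base URL changes between protocols, the same derivation would apply to `http://` and `ftp://` bases alike.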

I'm not sure about the changes to fetch-dump.sh. Is anyone likely to be using a custom FTP server?

atj commented 1 year ago

Here's my limited testing so far:

❯ docker-compose run --rm musicbrainz createdb.sh -sample -fetch
[+] Running 5/5
 ✔ Container musicbrainz-docker-redis-1      Created                                                                                                                                                                                     0.5s 
 ✔ Container musicbrainz-docker-mq-1         Created                                                                                                                                                                                     0.5s 
 ✔ Container musicbrainz-docker-validator-1  Created                                                                                                                                                                                     0.5s 
 ✔ Container musicbrainz-docker-db-1         Created                                                                                                                                                                                     0.5s 
 ✔ Container musicbrainz-docker-search-1     Created                                                                                                                                                                                     0.5s 
[+] Running 5/5
 ✔ Container musicbrainz-docker-mq-1         Started                                                                                                                                                                                     0.6s 
 ✔ Container musicbrainz-docker-validator-1  Started                                                                                                                                                                                     0.6s 
 ✔ Container musicbrainz-docker-redis-1      Started                                                                                                                                                                                     1.0s 
 ✔ Container musicbrainz-docker-db-1         Started                                                                                                                                                                                     1.0s 
 ✔ Container musicbrainz-docker-search-1     Started                                                                                                                                                                                     1.0s 
Mon 17 Apr 2023 04:57:34 PM UTC: Fetching database dump...
--2023-04-17 16:57:34--  http://ftp.eu.metabrainz.org/pub/musicbrainz/data/sample/LATEST
Resolving ftp.eu.metabrainz.org (ftp.eu.metabrainz.org)... 138.201.203.43
Connecting to ftp.eu.metabrainz.org (ftp.eu.metabrainz.org)|138.201.203.43|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16 [application/octet-stream]
Saving to: ‘/media/dbdump/LATEST’

LATEST                                                      100%[=========================================================================================================================================>]      16  --.-KB/s    in 0s      

2023-04-17 16:57:34 (6.20 MB/s) - ‘/media/dbdump/LATEST’ saved [16/16]
❯ export MUSICBRAINZ_BASE_FTP_URL=ftp://ftp.eu.metabrainz.org/pub/musicbrainz 
❯ docker-compose run --rm musicbrainz createdb.sh -sample -fetch
[+] Running 5/5
 ✔ Container musicbrainz-docker-validator-1  Created                                                                                                                                                                                     0.6s 
 ✔ Container musicbrainz-docker-mq-1         Created                                                                                                                                                                                     0.6s 
 ✔ Container musicbrainz-docker-db-1         Created                                                                                                                                                                                     0.6s 
 ✔ Container musicbrainz-docker-redis-1      Created                                                                                                                                                                                     0.5s 
 ✔ Container musicbrainz-docker-search-1     Created                                                                                                                                                                                     0.6s 
[+] Running 5/5
 ✔ Container musicbrainz-docker-db-1         Started                                                                                                                                                                                     0.6s 
 ✔ Container musicbrainz-docker-redis-1      Started                                                                                                                                                                                     0.6s 
 ✔ Container musicbrainz-docker-search-1     Started                                                                                                                                                                                     1.0s 
 ✔ Container musicbrainz-docker-validator-1  Started                                                                                                                                                                                     0.9s 
 ✔ Container musicbrainz-docker-mq-1         Started                                                                                                                                                                                     0.9s 
Mon 17 Apr 2023 05:01:04 PM UTC: Fetching database dump...
--2023-04-17 17:01:04--  ftp://ftp.eu.metabrainz.org/pub/musicbrainz/data/sample/LATEST
           => ‘/media/dbdump/LATEST’
Resolving ftp.eu.metabrainz.org (ftp.eu.metabrainz.org)... 138.201.203.43
Connecting to ftp.eu.metabrainz.org (ftp.eu.metabrainz.org)|138.201.203.43|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /pub/musicbrainz/data/sample ... done.
==> SIZE LATEST ... 16
==> PASV ... done.    ==> RETR LATEST ... done.
Length: 16 (unauthoritative)

LATEST                                                      100%[=========================================================================================================================================>]      16  --.-KB/s    in 0.001s  

2023-04-17 17:01:05 (25.7 KB/s) - ‘/media/dbdump/LATEST’ saved [16]
❯ export MUSICBRAINZ_MIRROR_URL=ftp://ftp.eu.metabrainz.org/pub/musicbrainz
❯ docker-compose run --rm musicbrainz createdb.sh -sample -fetch
[+] Running 5/5
 ✔ Container musicbrainz-docker-db-1         Created                                                                                                                                                                                     0.5s 
 ✔ Container musicbrainz-docker-redis-1      Created                                                                                                                                                                                     0.5s 
 ✔ Container musicbrainz-docker-mq-1         Created                                                                                                                                                                                     0.5s 
 ✔ Container musicbrainz-docker-validator-1  Created                                                                                                                                                                                     0.5s 
 ✔ Container musicbrainz-docker-search-1     Created                                                                                                                                                                                     0.5s 
[+] Running 5/5
 ✔ Container musicbrainz-docker-search-1     Started                                                                                                                                                                                     0.9s 
 ✔ Container musicbrainz-docker-validator-1  Started                                                                                                                                                                                     0.6s 
 ✔ Container musicbrainz-docker-db-1         Started                                                                                                                                                                                     0.9s 
 ✔ Container musicbrainz-docker-redis-1      Started                                                                                                                                                                                     0.9s 
 ✔ Container musicbrainz-docker-mq-1         Started                                                                                                                                                                                     0.6s 
Mon 17 Apr 2023 05:03:25 PM UTC: Fetching database dump...
--2023-04-17 17:03:25--  ftp://ftp.eu.metabrainz.org/pub/musicbrainz/data/sample/LATEST
           => ‘/media/dbdump/LATEST’
Resolving ftp.eu.metabrainz.org (ftp.eu.metabrainz.org)... 138.201.203.43
Connecting to ftp.eu.metabrainz.org (ftp.eu.metabrainz.org)|138.201.203.43|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /pub/musicbrainz/data/sample ... done.
==> SIZE LATEST ... 16
==> PASV ... done.    ==> RETR LATEST ... done.
Length: 16 (unauthoritative)

LATEST                                                      100%[=========================================================================================================================================>]      16  --.-KB/s    in 0.001s  

2023-04-17 17:03:25 (24.2 KB/s) - ‘/media/dbdump/LATEST’ saved [16]
❯ export MUSICBRAINZ_MIRROR_URL=ftp.eu.metabrainz.org/pub/musicbrainz
❯ docker-compose run --rm musicbrainz createdb.sh -sample -fetch
[+] Running 5/5
 ✔ Container musicbrainz-docker-redis-1      Created                                                                                                                                                                                     0.4s 
 ✔ Container musicbrainz-docker-search-1     Created                                                                                                                                                                                     0.4s 
 ✔ Container musicbrainz-docker-db-1         Created                                                                                                                                                                                     0.4s 
 ✔ Container musicbrainz-docker-mq-1         Created                                                                                                                                                                                     0.4s 
 ✔ Container musicbrainz-docker-validator-1  Created                                                                                                                                                                                     0.4s 
[+] Running 5/5
 ✔ Container musicbrainz-docker-validator-1  Started                                                                                                                                                                                     0.9s 
 ✔ Container musicbrainz-docker-mq-1         Started                                                                                                                                                                                     0.9s 
 ✔ Container musicbrainz-docker-search-1     Started                                                                                                                                                                                     0.9s 
 ✔ Container musicbrainz-docker-redis-1      Started                                                                                                                                                                                     0.6s 
 ✔ Container musicbrainz-docker-db-1         Started                                                                                                                                                                                     0.6s 
[+] Building 2.2s (13/13) FINISHED                                                                                                                                                                                                            
fetch-dump.sh: --mirror-url must begin with ftp://, http:// or https://
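
The error above comes from a mirror URL given without a scheme. A minimal sketch of a check that would produce this message is shown here; `check_mirror_url` is a hypothetical name, and the real `fetch-dump.sh` may implement the validation differently.

```shell
#!/bin/sh
# Hypothetical sketch, not the project's actual code.
# Accept only mirror URLs that start with one of the supported schemes.
# Returns 0 when the URL is acceptable, 1 (with a message on stderr) otherwise.
check_mirror_url() {
  case "$1" in
    ftp://*|http://*|https://*)
      return 0
      ;;
    *)
      echo "fetch-dump.sh: --mirror-url must begin with ftp://, http:// or https://" >&2
      return 1
      ;;
  esac
}
```

Under this sketch, `ftp.eu.metabrainz.org/pub/musicbrainz` is rejected, while the `ftp://` and `http://` forms used in the earlier runs pass.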

atj commented 1 year ago

Thanks for your help and feedback on this @yvanzo!