influxdata / influxdb

Scalable datastore for metrics, events, and real-time analytics
https://influxdata.com
Apache License 2.0

Upgrade to Docker influxdb:2.0.0-rc from 2.0.0-beta fails #19673

Closed: sgreszcz closed this issue 4 years ago

sgreszcz commented 4 years ago

Steps to reproduce:

  1. Was running the InfluxDB Docker image 2.0.0-beta
  2. Pulled the new 2.0.0-rc image using Ansible/Docker-Compose
  3. The 2.0.0-rc container doesn't start and logs errors

Expected behavior: Seamless upgrade with the new 2.0.0-rc Docker image

Actual behavior: These errors:

ts=2020-10-01T10:54:02.698357Z lvl=info msg="Welcome to InfluxDB" log_id=0P_xbFnW000 version=2.0.0-rc.0 commit=df47ec7bb2 build_date=2020-09-29T22:08:57Z
ts=2020-10-01T10:54:02.699284Z lvl=info msg="Resources opened" log_id=0P_xbFnW000 service=bolt path=/root/.influxdbv2/influxd.bolt
ts=2020-10-01T10:54:02.700032Z lvl=info msg="Migration \"migrate task owner id\" started (up)" log_id=0P_xbFnW000 service=migrations
ts=2020-10-01T10:54:02.703616Z lvl=info msg="Migration \"migrate task owner id\" completed (up)" log_id=0P_xbFnW000 service=migrations
ts=2020-10-01T10:54:02.703633Z lvl=info msg="Migration \"create DBRP buckets\" started (up)" log_id=0P_xbFnW000 service=migrations
ts=2020-10-01T10:54:02.707709Z lvl=info msg="Migration \"create DBRP buckets\" completed (up)" log_id=0P_xbFnW000 service=migrations
ts=2020-10-01T10:54:02.707724Z lvl=info msg="Migration \"create pkger stacks buckets\" started (up)" log_id=0P_xbFnW000 service=migrations
ts=2020-10-01T10:54:02.711072Z lvl=info msg="Migration \"create pkger stacks buckets\" completed (up)" log_id=0P_xbFnW000 service=migrations
ts=2020-10-01T10:54:02.711088Z lvl=info msg="Migration \"delete sessionsv1 bucket\" started (up)" log_id=0P_xbFnW000 service=migrations
ts=2020-10-01T10:54:02.726936Z lvl=info msg="Migration \"delete sessionsv1 bucket\" completed (up)" log_id=0P_xbFnW000 service=migrations
ts=2020-10-01T10:54:02.726967Z lvl=info msg="Migration \"Create TSM metadata buckets\" started (up)" log_id=0P_xbFnW000 service=migrations
ts=2020-10-01T10:54:02.729418Z lvl=info msg="Migration \"Create TSM metadata buckets\" completed (up)" log_id=0P_xbFnW000 service=migrations
ts=2020-10-01T10:54:02.733449Z lvl=debug msg="buckets find" log_id=0P_xbFnW000 store=new took=0.114ms
ts=2020-10-01T10:54:02.733472Z lvl=info msg="Checking InfluxDB metadata for prior version." log_id=0P_xbFnW000 bolt_path=/root/.influxdbv2/influxd.bolt
ts=2020-10-01T10:54:02.733479Z lvl=error msg="Missing metadata for bucket." log_id=0P_xbFnW000 bucket=ucv bucket_id=05e58e482154a001
ts=2020-10-01T10:54:02.733485Z lvl=error msg="Missing metadata for bucket." log_id=0P_xbFnW000 bucket=_tasks bucket_id=05e58e482154a002
ts=2020-10-01T10:54:02.733490Z lvl=error msg="Missing metadata for bucket." log_id=0P_xbFnW000 bucket=_monitoring bucket_id=05e58e482154a003
ts=2020-10-01T10:54:02.733496Z lvl=error msg="Missing metadata for bucket." log_id=0P_xbFnW000 bucket=influx_monitoring bucket_id=05f7272f1a14a000
ts=2020-10-01T10:54:02.733501Z lvl=error msg="Incompatible InfluxDB 2.0 metadata found. File must be moved before influxd will start." log_id=0P_xbFnW000 path=/root/.influxdbv2/influxd.bolt
ts=2020-10-01T10:54:02.733520Z lvl=error msg="Found directory that is incompatible with this version of InfluxDB." log_id=0P_xbFnW000 path=/root/.influxdbv2/engine/_series
ts=2020-10-01T10:54:02.733532Z lvl=error msg="Found directory that is incompatible with this version of InfluxDB." log_id=0P_xbFnW000 path=/root/.influxdbv2/engine/index
ts=2020-10-01T10:54:02.733543Z lvl=error msg="Incompatible InfluxDB 2.0 version found. Move all files outside of engine_path before influxd will start." log_id=0P_xbFnW000 engine_path=/root/.influxdbv2/engine
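
The error entries above name every path that the 2.0.0-rc image refuses to open. As a rough aid for reading this kind of output, here is a minimal Python sketch that parses the logfmt-style lines and collects those paths. The helper names `parse_logfmt` and `incompatible_paths` are made up for illustration, and the sketch assumes each entry is a single line of space-separated `key=value` pairs with double-quoted values, as in the log above.

```python
import shlex

# Three entries copied from the log above.
SAMPLE = [
    r'ts=2020-10-01T10:54:02.700032Z lvl=info msg="Migration \"migrate task owner id\" started (up)" log_id=0P_xbFnW000 service=migrations',
    r'ts=2020-10-01T10:54:02.733501Z lvl=error msg="Incompatible InfluxDB 2.0 metadata found. File must be moved before influxd will start." log_id=0P_xbFnW000 path=/root/.influxdbv2/influxd.bolt',
    r'ts=2020-10-01T10:54:02.733543Z lvl=error msg="Incompatible InfluxDB 2.0 version found. Move all files outside of engine_path before influxd will start." log_id=0P_xbFnW000 engine_path=/root/.influxdbv2/engine',
]


def parse_logfmt(line):
    """Split one logfmt line into a dict of key -> value."""
    fields = {}
    # shlex handles the double-quoted values and the escaped quotes inside them.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def incompatible_paths(lines):
    """Return paths named by error-level entries, in order, without duplicates."""
    paths = []
    for line in lines:
        fields = parse_logfmt(line)
        if fields.get("lvl") != "error":
            continue
        for key in ("path", "engine_path"):
            if key in fields and fields[key] not in paths:
                paths.append(fields[key])
    return paths


for path in incompatible_paths(SAMPLE):
    print(path)
```

Running the same function over the full log would list the bolt file, the `_series` and `index` directories, and the engine directory itself.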

Even worse, I can't go back to 2.0.0-beta now :(

docker ps
CONTAINER ID        IMAGE                                      COMMAND                  CREATED              STATUS                          PORTS               NAMES
bcf31eb7d3e5        quay.io/influxdb/influxdb:2.0.0-beta       "/entrypoint.sh --re…"   About a minute ago   Restarting (1) 15 seconds ago                       influxdb

ts=2020-10-01T11:04:09.648684Z lvl=info msg="Welcome to InfluxDB" log_id=0P_yBIh0000 version=2.0.0-beta.12 commit=ff620782eb build_date=2020-10-01T11:04:09Z
ts=2020-10-01T11:04:09.651083Z lvl=info msg="Resources opened" log_id=0P_yBIh0000 service=bolt path=/root/.influxdbv2/influxd.bolt
ts=2020-10-01T11:04:09.660307Z lvl=error msg="Failed to initialize kv service" log_id=0P_yBIh0000 error="up: reading migrations: migration \"migrate task owner id\": migration specification not found"
up: reading migrations: migration "migrate task owner id": migration specification not found

Environment info:

Docker on Ubuntu 18.04

Config:

No custom config.

Logs:

See above

sgreszcz commented 4 years ago

So to recover, what files need to be removed/cleaned from the influxdb/engine path?

drwxr-xr-x  6 root     root       57 Jun 24 12:08 .
drwxr-xr-x  3 influxdb influxdb   40 Jun 24 12:08 ..
drwxr-xr-x  2 root     root     8192 Oct  1 10:39 data
drwxr-xr-x 10 root     root       78 Jun 24 12:08 index
drwxr-xr-x 10 root     root       86 Jun 24 12:08 _series
drwxr-xr-x  2 root     root       42 Oct  1 10:48 wal

sgreszcz commented 4 years ago

OK, it looks like there are some large breaking changes going from v2-beta to v2-rc. It would have been nice to see some notes about that on the docs page, as many people will pull the new Docker image unaware of them. https://docs.influxdata.com/influxdb/v2.0/reference/rc0-upgrade-guide/

Also, this doesn't tell me why I can't roll back to the beta Docker image after trying to start v2-rc. That is a bigger problem if we can't go back in order to go forward properly.

rickspencer3 commented 4 years ago

@russorat thoughts?

sgreszcz commented 4 years ago

It looks like the "influxd.bolt" file on the shared Docker volume gets altered when a v2-rc image is started against that volume. If you then try to revert, something in "influxd.bolt" prevents v2-beta from starting:

ts=2020-10-01T14:07:01.766100Z lvl=info msg="Welcome to InfluxDB" log_id=0Pa7dzXW000 version=2.0.0-beta.12 commit=ff620782eb build_date=2020-10-01T14:07:01Z
ts=2020-10-01T14:07:01.768662Z lvl=info msg="Resources opened" log_id=0Pa7dzXW000 service=bolt path=/root/.influxdbv2/influxd.bolt
ts=2020-10-01T14:07:01.770235Z lvl=error msg="Failed to initialize kv service" log_id=0Pa7dzXW000 error="up: reading migrations: migration \"migrate task owner id\": migration specification not found"
up: reading migrations: migration "migrate task owner id": migration specification not found
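
Because starting the newer image changes the metadata file in place, a copy taken beforehand is what keeps a rollback possible. The following is a minimal Python sketch, not an official procedure. It assumes the container is stopped and that the volume is reachable from wherever the script runs; the function name `snapshot_bolt` and the `.bak` naming scheme are invented for illustration.

```python
import shutil
import time
from pathlib import Path


def snapshot_bolt(bolt_path, backup_dir):
    """Copy the bolt metadata file into backup_dir and return the copy's path."""
    src = Path(bolt_path)
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Timestamp the copy so repeated snapshots do not overwrite each other
    # (unless two are taken within the same second).
    stamp = time.strftime("%Y%m%dT%H%M%S")
    dest = dest_dir / f"{src.name}.{stamp}.bak"
    shutil.copy2(src, dest)  # copy2 also preserves file timestamps
    return dest


# Example call with hypothetical paths (the bolt path is the one from the log):
# snapshot_bolt("/root/.influxdbv2/influxd.bolt", "/root/influxdb-backups")
```

Restoring is then a matter of copying the snapshot back over "influxd.bolt" before starting the older image.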

When I started v2-beta again without "influxd.bolt", it seemed to see the data in my wal and data folders, but none of the buckets or other configuration. Trying to reach host:9999 gave me the startup GUI (and my Telegraf feed obviously doesn't work, as the bucket and token aren't there).

ts=2020-10-01T14:12:50.304725Z lvl=info msg="Welcome to InfluxDB" log_id=0Pa7zG00000 version=2.0.0-beta.12 commit=ff620782eb build_date=2020-10-01T14:12:50Z
ts=2020-10-01T14:12:50.314919Z lvl=info msg="Resources opened" log_id=0Pa7zG00000 service=bolt path=/root/.influxdbv2/influxd.bolt
ts=2020-10-01T14:12:50.315954Z lvl=info msg="Migration \"initial migration\" started (up)" log_id=0Pa7zG00000 store=kv
ts=2020-10-01T14:12:50.323086Z lvl=info msg="Migration \"initial migration\" completed (up)" log_id=0Pa7zG00000 store=kv
ts=2020-10-01T14:12:50.323137Z lvl=info msg="Migration \"add index \\\"userresourcemappingsbyuserindexv1\\\"\" started (up)" log_id=0Pa7zG00000 store=kv
ts=2020-10-01T14:12:50.327569Z lvl=info msg="Migration \"add index \\\"userresourcemappingsbyuserindexv1\\\"\" completed (up)" log_id=0Pa7zG00000 store=kv
ts=2020-10-01T14:12:50.345845Z lvl=info msg="Opening Series File (start)" log_id=0Pa7zG00000 service=storage-engine service=series-file op_name=series_file_open path=/root/.influxdbv2/engine/_series op_event=start
ts=2020-10-01T14:12:50.549658Z lvl=info msg="Opening Series File (end)" log_id=0Pa7zG00000 service=storage-engine service=series-file op_name=series_file_open path=/root/.influxdbv2/engine/_series op_event=end op_elapsed=203.818ms
ts=2020-10-01T14:12:50.624602Z lvl=info msg="Index opened" log_id=0Pa7zG00000 service=storage-engine index=tsi partitions=8
ts=2020-10-01T14:12:50.626407Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000015.tsm id=10 duration=0.908ms
ts=2020-10-01T14:12:50.626451Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000006.tsm id=1 duration=0.941ms
ts=2020-10-01T14:12:50.626462Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000008.tsm id=3 duration=1.045ms
ts=2020-10-01T14:12:50.626416Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000011.tsm id=6 duration=0.992ms
ts=2020-10-01T14:12:50.626498Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000010.tsm id=5 duration=1.018ms
ts=2020-10-01T14:12:50.626509Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000005.tsm id=0 duration=1.129ms
ts=2020-10-01T14:12:50.626646Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000009.tsm id=4 duration=1.203ms
ts=2020-10-01T14:12:50.627501Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000007.tsm id=2 duration=1.001ms
ts=2020-10-01T14:12:50.627578Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000026.tsm id=21 duration=1.038ms
ts=2020-10-01T14:12:50.627605Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000014.tsm id=9 duration=1.080ms
ts=2020-10-01T14:12:50.627679Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000017.tsm id=12 duration=1.111ms
ts=2020-10-01T14:12:50.627859Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000016.tsm id=11 duration=1.292ms
ts=2020-10-01T14:12:50.628599Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000019.tsm id=14 duration=1.857ms
ts=2020-10-01T14:12:50.628639Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000012.tsm id=7 duration=0.916ms
ts=2020-10-01T14:12:50.628735Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000013.tsm id=8 duration=1.106ms
ts=2020-10-01T14:12:50.628740Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000018.tsm id=13 duration=2.144ms
ts=2020-10-01T14:12:50.628966Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000021.tsm id=16 duration=1.312ms
ts=2020-10-01T14:12:50.629320Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000020.tsm id=15 duration=1.781ms
ts=2020-10-01T14:12:50.629608Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000031.tsm id=26 duration=0.759ms
ts=2020-10-01T14:12:50.629638Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007089-000000001.tsm id=32 duration=4.035ms
ts=2020-10-01T14:12:50.629837Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000022.tsm id=17 duration=1.925ms
ts=2020-10-01T14:12:50.630035Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000025.tsm id=20 duration=1.395ms
ts=2020-10-01T14:12:50.630413Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000023.tsm id=18 duration=1.635ms
ts=2020-10-01T14:12:50.630459Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000028.tsm id=23 duration=1.104ms
ts=2020-10-01T14:12:50.630485Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000024.tsm id=19 duration=1.819ms
ts=2020-10-01T14:12:50.630583Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000027.tsm id=22 duration=1.556ms
ts=2020-10-01T14:12:50.631257Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000029.tsm id=24 duration=1.562ms
ts=2020-10-01T14:12:50.638439Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007088-000000001.tsm id=31 duration=7.906ms
ts=2020-10-01T14:12:50.639948Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007076-000000030.tsm id=25 duration=10.275ms
ts=2020-10-01T14:12:50.642425Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007084-000000002.tsm id=27 duration=12.360ms
ts=2020-10-01T14:12:50.642574Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007085-000000001.tsm id=28 duration=12.120ms
ts=2020-10-01T14:12:50.642568Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007086-000000001.tsm id=29 duration=12.652ms
ts=2020-10-01T14:12:50.643388Z lvl=info msg="Opened file" log_id=0Pa7zG00000 service=storage-engine engine=tsm1 service=filestore path=/root/.influxdbv2/engine/data/000000000007087-000000001.tsm id=30 duration=12.904ms
ts=2020-10-01T14:12:50.643578Z lvl=info msg="Reading file" log_id=0Pa7zG00000 service=storage-engine path=/root/.influxdbv2/engine/wal/_23125.wal size=10509389
ts=2020-10-01T14:12:53.056720Z lvl=info msg="Reading file" log_id=0Pa7zG00000 service=storage-engine path=/root/.influxdbv2/engine/wal/_23126.wal size=6884591
ts=2020-10-01T14:12:54.624570Z lvl=info msg="Reloaded WAL" log_id=0Pa7zG00000 service=storage-engine path=/root/.influxdbv2/engine/wal duration=3981.029ms
ts=2020-10-01T14:12:54.624664Z lvl=info msg=Starting log_id=0Pa7zG00000 service=storage-engine component=retention_enforcer check_interval=1h
ts=2020-10-01T14:12:54.625157Z lvl=info msg="Starting query controller" log_id=0Pa7zG00000 service=storage-reads concurrency_quota=10 initial_memory_bytes_quota_per_query=9223372036854775807 memory_bytes_quota_per_query=9223372036854775807 max_memory_bytes=0 queue_size=10
ts=2020-10-01T14:12:54.938636Z lvl=info msg=Listening log_id=0Pa7zG00000 transport=http addr=:9999 port=9999
ts=2020-10-01T14:13:00.019458Z lvl=info msg=Unauthorized log_id=0Pa7zG00000 error="authorization not found"
ts=2020-10-01T14:13:00.020070Z lvl=debug msg=Request log_id=0Pa7zG00000 service=http method=POST host=db-alln-002:9999 path=/api/v2/write query="bucket=ucv&org=meetingx" proto=HTTP/1.1 status_code=401 response_size=55 content_length=-1 referrer= remote=173.36.65.181:45708 user_agent=Telegraf took=0.974ms error=unauthorized error_code=unauthorized
ts=2020-10-01T14:13:10.018377Z lvl=info msg=Unauthorized log_id=0Pa7zG00000 error="authorization not found"
ts=2020-10-01T14:13:10.018554Z lvl=debug msg=Request log_id=0Pa7zG00000 service=http method=POST host=db-alln-002:9999 path=/api/v2/write query="bucket=ucv&org=meetingx" proto=HTTP/1.1 status_code=401 response_size=55 content_length=-1 referrer= remote=173.36.65.181:45708 user_agent=Telegraf took=0.331ms error=unauthorized error_code=unauthorized
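
The 401 responses above are consistent with the token Telegraf was configured with no longer existing in the fresh metadata file. For reference, here is a sketch of how a write request to this endpoint is shaped, using only Python's standard library. The host, org, and bucket are taken from the log; the token value and the sample line-protocol point are placeholders, and the helper name `build_write_request` is invented here. The request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_write_request(base_url, org, bucket, token, line_protocol):
    """Build (but do not send) a POST to the InfluxDB 2.0 /api/v2/write endpoint."""
    query = urlencode({"org": org, "bucket": bucket})
    return Request(
        url=f"{base_url}/api/v2/write?{query}",
        data=line_protocol.encode("utf-8"),
        # InfluxDB 2.0 expects the API token in this header form.
        headers={"Authorization": f"Token {token}"},
        method="POST",
    )


req = build_write_request(
    "http://db-alln-002:9999",
    org="meetingx",
    bucket="ucv",
    token="PLACEHOLDER_TOKEN",             # placeholder, not a real token
    line_protocol="cpu,host=a usage=0.5",  # made-up sample point
)
print(req.get_method(), req.full_url)
```

Until a token that the running instance knows about is supplied in that header, every write is rejected as in the log.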

russorat commented 4 years ago

@sgreszcz sorry you’re having trouble here. We tried to add warnings everywhere we could about the breaking changes.

For the error you see in bolt, there is a section in the docs about removing the offending migration from the boltdb file: https://docs.influxdata.com/influxdb/v2.0/reference/rc0-upgrade-guide/#5-start-old-influxdb-beta-instance

sgreszcz commented 4 years ago

Aaaand we're back 👍 Thanks for that hint. For anyone looking, the boltbrowser tool is here: https://github.com/br0xen/boltbrowser

russorat commented 4 years ago

Thanks for the update! Can I ask how you found the docker rc image so we can add a warning there if we can? We will add the link to boltbrowser to the docs as well.

sgreszcz commented 4 years ago

On the InfluxDB "getting started" docs page: https://docs.influxdata.com/influxdb/v2.0/get-started/#start-with-influxdb-oss

Thanks for the help with this. At least others who trip up on this may find their way via this GitHub issue!

russorat commented 4 years ago

Thanks! I opened https://github.com/influxdata/docs-v2/issues/1579 and fixed the link: https://github.com/influxdata/docs-v2/commit/a78bfd9230fbcfc2e304ae239fa0f59def2b6d28