influxdata / influxdb

Scalable datastore for metrics, events, and real-time analytics
https://influxdata.com
Apache License 2.0

[bug] drop/delete terribly slow #9636

Open lukaszgryglicki opened 6 years ago

lukaszgryglicki commented 6 years ago

Bug report

Influx version: v1.5.1
OS version: Ubuntu Linux 17.04

Steps to reproduce:

  1. Have a big database with > 20000 series
  2. Try to execute drop series from /{{regexp_pattern}}/.
  3. or try to execute drop measurement /{{regexp_pattern}}/.
  4. or try to execute delete from /{{regexp_pattern}}/.

It will take > 30 minutes.

Expected behavior:

  1. Well, deleting data should take milliseconds, shouldn't it?
  2. Or there should be some special command modifier that does it fast (I think the slowness comes from some InfluxDB cleanup work). And there should be another command to run this "long" part manually, so that, for example, I can delete a ton of series and then run the cleanup process (to reclaim space) from a cron job or something like that.
mark-rushakoff commented 6 years ago

It definitely shouldn't be taking that long.

Can you gather some profiles while the slow delete is in progress?

curl -o profiles.tar.gz "http://localhost:8086/debug/pprof/all?cpu=true"

curl -o vars.txt "http://localhost:8086/debug/vars"
iostat -xd 1 30 > iostat.txt

The workaround I would try is to break out the regular expression, like running SHOW MEASUREMENTS WITH MEASUREMENT =~ /pattern/, and then make individual DROP calls in one statement, like DROP MEASUREMENT match1; DROP MEASUREMENT match2; DROP MEASUREMENT match3....

If you run the DROP calls as separate statements, you may trigger a compaction on each run.
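For illustration, here is a minimal sketch of that workaround, assuming the influx CLI is on the PATH, the database is named "mydb", and -format csv prints one measurement per row in the second column (adjust the parsing to your CLI's output):

# build one semicolon-joined statement of individual DROP MEASUREMENT calls
stmt=$(influx -database mydb -format csv \
  -execute 'SHOW MEASUREMENTS WITH MEASUREMENT =~ /pattern/' \
  | tail -n +2 | cut -d, -f2 \
  | sed 's/.*/DROP MEASUREMENT "&"/' | paste -sd ';' -)
# send all DROPs in a single request rather than one statement per run
influx -database mydb -execute "$stmt"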

lukaszgryglicki commented 6 years ago

I will try to provide the details you need tomorrow. Dropping series one by one is actually a lot faster than a single drop with a regexp, but dropping any single series still takes tens of seconds.

Can I somehow turn compaction off? My actual/current solution (implemented today) is a code reorganization such that the drop is no longer needed: I'm using the time value as a key, so there is no need to drop. Adding a point will either add a new point or overwrite an existing one.

I also have a problem with my InfluxDB copy program (which I wrote myself months ago). It now takes about 3-10x longer than it did on v1.3.x.

I'll be investigating this tomorrow; for now, the 3x-10x extra time is due to multiple "timeout error" responses that I'm receiving and retrying (my copy program deals with this).

I suspect the problem is due to the number of threads - I'm using 48 threads and http-timeout = 300s, and just after about 300s of data copying I start to receive "timeout error". Maybe the influxd server is not allowing 48 threads at once? But I cannot check this now, because a very time-consuming process is running and it will take hours to finish.

Is there a way to tell influxd to allow an unlimited number of connections, or at least say 100? What is the default?

This is my InfluxDB backup tool. It can copy one Influx database to another, local/remote etc.: https://github.com/cncf/devstats/blob/master/cmd/idb_backup/idb_backup.go

I need such a tool because when I do a full data regeneration, I need to save it to a temp database first (it takes hours) so as not to disturb the original database, which is used by Grafana. Once this completes successfully, I copy the temp database to the one used by Grafana (which is safer and gives less downtime) and finally drop the temp database.

The better option would be to generate the temp DB and then just rename it, but unfortunately InfluxDB does not support that either :-(

lukaszgryglicki commented 6 years ago

Uploaded requested profiles: profiles.tar.gz iostat.txt vars.txt

While running delete from /^open_(issues|prs)_sigs_milestones/ - it has already been running for 4+ minutes.

lukaszgryglicki commented 6 years ago

The delete is still running (19 minutes already); it was at least 40x faster in v1.3.x. Requested stats after about 17 minutes of the running delete: iostat.txt profiles.tar.gz vars.txt

mark-rushakoff commented 6 years ago

Can I somehow turn compaction off?

I don't believe that is supported. Maybe setting max-concurrent-compactions=1 would suffice here. https://docs.influxdata.com/influxdb/v1.5/administration/config/#max-concurrent-compactions-0
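For reference, a sketch of where this setting lives in influxdb.conf, assuming the 1.5 config layout (0 means unlimited and is the default):

[data]
  # run at most one compaction at a time
  max-concurrent-compactions = 1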

Adding a point will either add new point or overwrite existing one.

Overwriting existing points is expensive and should generally be avoided; when it can't be avoided, it should happen in bulk.

I also have a problem with my InfluxDB copy program (which I wrote myself months ago).

I suspect you'll get much better performance by using the native backup and restore tools.

Is there a way to tell influxd to allow an unlimited number of connections, or at least say 100? What is the default?

Default should be unlimited: https://docs.influxdata.com/influxdb/v1.5/administration/config/#max-connection-limit-0

Uploaded requested profiles:

Thanks, this will be very helpful in identifying the performance bottleneck.

@rbetts care to prioritize/assign this?

rbetts commented 6 years ago

@lukaszgryglicki Thanks for the very complete report. We'll triage this in an upcoming grooming session.

lukaszgryglicki commented 6 years ago

BTW: I've killed the query after 40 minutes and implemented alternate solution:

https://github.com/cncf/devstats/blob/master/cmd/z2influx/z2influx.go

lukaszgryglicki commented 6 years ago

I suspect you'll get much better performance by using the native backup and restore tools.

BTW: is there a native program to copy one Influx DB into another, or to rename one? There wasn't such a program 3 months ago - I'm quite sure, because I spent more than a week researching this some time ago.

Overwriting existing points is expensive and should generally be avoided; when it can't be avoided, it should happen in bulk.

All my programs use bulk writing.

mark-rushakoff commented 6 years ago

BTW: is there a native program to copy one Influx DB into another, or to rename one?

https://docs.influxdata.com/influxdb/v1.5/administration/backup_and_restore/
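For illustration, a rough sketch of copying database A into a new database B with the 1.5 portable backup/restore (flag names assumed from the linked docs; paths are placeholders):

influxd backup -portable -database A /tmp/A_backup
influxd restore -portable -db A -newdb B /tmp/A_backup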

lukaszgryglicki commented 6 years ago

I'll double-check tomorrow then, but a quick question: can I back up a database named A into some file and then restore it into a database named B? Here is what I need to do:

Copying A_temp to A should be orders of magnitude faster than just generating A (avoiding downtime).

Is this feature new in v1.5? I reported issues about database copying here a few months ago, along with my own solutions to it.

lukaszgryglicki commented 6 years ago

Maybe the -newdb option of influxd will be useful for creating a copy of a database, but I see that I need to stop influxd to restore a database. That is not a good solution, because this influxd serves more than 20 databases, so it is not a good idea to stop it just to copy one database. Why do I need to stop influxd when restoring into a new (not yet existing) fresh database? My custom-written idb_backup program does a full copy & restore of an Influx database without stopping influxd; it was quite fast on v1.3.x and is now a lot slower on v1.5.0, but still usable. What is interesting: its performance seems to drop as the number of series increases, and it looks like a quadratic dependency... So with > 10000 series it becomes slow, but with 50000 series it is awful!

lukaszgryglicki commented 6 years ago

It seems like regexps are also very slow. There is only one series name that matches /^prs_labelled_d$/ and it is named "prs_labelled_d". When I do:

pol commented 6 years ago

I am having a similar issue where I need to delete lots of measurements (a malformed graphite template led to 100k measurements being created). Deleting an individual measurement takes between 20s and 2m, when it finishes at all (sometimes it just hangs).

I am unable to drop by regex (I get an error: ERR: error parsing query: found /, expected identifier at line 1, char 18).

The system is reasonably loaded, using 50% of system RAM and running at ~70% load.

This seems entirely related to the number of measurements, not the amount of data. These operations were fast yesterday (less than 5 seconds, though not instant) when I had 50 measurements. I am not writing any data to the new bogus measurements, so my data has not grown much, but the number of measurements has: I went from ~50 measurements to ~110000. Since they are raw graphite items, their patterns are easy to match with a regex, but if I were to delete the measurements individually it would take weeks.

Requesting the debug profiles using the command listed above takes 30s to start (the curl stats show no activity for 30s, then the download happens). This feels like the kind of latency that is happening during other operations (everything just feels very slow and laggy). Because of that (and the lack of regex drop), I can't dump all three stats during the same measurement-delete operation, so I just start the next one as soon as I can. As a note, the vars download is not slow.

Here is an archive of my profile data:

influx-profile.tar.gz

System Info (aws ec2 m5.large):

pol@influxdb-a-1 ~ % influx -version
InfluxDB shell version: 1.5.2
pol@influxdb-a-1 ~ % cat /etc/issue
Ubuntu 16.04.3 LTS \n \l
pol@influxdb-a-1 ~ % uname -a
Linux influxdb-a-1 4.4.0-1060-aws #69-Ubuntu SMP Sun May 20 13:42:07 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
lukaszgryglicki commented 6 years ago

We've replaced Influx with PostgreSQL and delete is instant. Postgres (which was a surprise to me) is faster not only for the delete operation, but for basically all operations.

pol commented 6 years ago

This is drifting off-topic a bit, but how do you deal with the dynamic-schema benefits of InfluxDB? Do you just not index the tags and store all measurements as strings? We have a fairly complex pile of pre-existing metrics, and developing a reasonable SQL schema for all of them would be very difficult (unless we just crudely stored them as text or something); the benefit of InfluxDB (and other storage systems like it) is that it is specific to the problem of metrics storage and querying (as opposed to general-purpose database work).

lukaszgryglicki commented 6 years ago

I've implemented something that dynamically creates tables/columns/indices as needed. It took more than a week, but it works great and is faster.

jacobmarble commented 6 years ago

I can reproduce this bug, but not consistently. I created a database with 100000 measurements and about 8 million series. I have twice seen delete from /regex/ behave as though it were completely locked up.

Delete from 1 measurement:

$ time influx -database stress -execute "delete from m99999"

real    0m0.084s
user    0m0.020s
sys     0m0.022s

Delete from about 9 measurements:

$ time influx -database stress -execute "delete from /m9999./"

real    0m3.773s
user    0m0.024s
sys     0m0.022s

Delete from about 90 measurements:

$ time influx -database stress -execute "delete from /m999.*/"

real    0m3.106s
user    0m0.019s
sys     0m0.018s

Delete from about 900 measurements (gave up, ctrl-C after ~5m):

$ time influx -database stress -execute "delete from /m99.*/"
^C

Adding some debug logging to tsdb/store.go, I can see that there is a deadlock. This also causes shutdown to time out and fall back to a hard shutdown.

jacobmarble commented 6 years ago

The deadlock is in partition.go, in Partition.Wait(), which simply calls i.wg.Wait().

jacobmarble commented 6 years ago

screen shot 2018-07-03 at 5 36 48 pm

e-dard commented 6 years ago

Opening as there is still possibly a performance issue.

jacobmarble commented 6 years ago

Related to #10056

jacobmarble commented 6 years ago
master
$ time influx -database stress -execute 'delete from /m9.*/'

real    0m31.381s
user    0m0.021s
sys     0m0.021s

1.6
$ time influx -database stress -execute 'delete from /m9.*/'

real    0m29.174s
user    0m0.023s
sys     0m0.033s

1.5
$ time influx -database stress -execute 'delete from /m9.*/'

real    0m22.821s
user    0m0.022s
sys     0m0.029s

1.4 (inmem)
$ time influx -database stress -execute 'delete from /m9.*/'

real    0m0.385s
user    0m0.023s
sys     0m0.030s

1.3 (inmem)
$ time influx -database stress -execute 'delete from /m8.*/'

real    0m0.413s
user    0m0.020s
sys     0m0.023s
JulienChampseix commented 5 years ago

@jacobmarble do you have an update on this investigation / bug fix?

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

e-dard commented 5 years ago

don't close.

PennyYip commented 4 years ago

I am using 1.7.7 and the issue still persists.

Deleting a single measurement with 1 million series takes forever.

iostat shows the disk is not under very high utilization:


Device            r/s     w/s     rkB/s     wkB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util

loop0            0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
nvme0n1          0.00  723.00      0.00   3046.50     0.00   163.00   0.00  18.40    0.00    1.62   0.07     0.00     4.21   0.05   3.60
nvme1n1          0.00  723.00      0.00   3046.50     0.00   163.00   0.00  18.40    0.00    1.84   0.10     0.00     4.21   0.07   5.20
md1              0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
md2              0.00  761.00      0.00   3044.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     4.00   0.00   0.00
md0              0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
pohmelie commented 4 years ago

Still present in 1.7.8.
A delete query (delete where time < '2018-01-01') over 10,000 measurements took 25 seconds.

bryanspears commented 4 years ago

It's not just the delete that is slow; the garbage collection and/or compaction that happens after the delete is terrible too. I deleted a couple hundred measurements and the load jumped 6x for over an hour. Memory maxed out. I/O wait is around 50% of all CPU time (cloud resources!).

jacobmarble commented 4 years ago

If I remember correctly, we decided in 1.5 to block on delete, rather than accept the delete request and handle it asynchronously. My earlier analysis probably reflects this. So "that's a feature, not a bug" is my offhand comment for that.

@bryanspears have you tried limiting concurrent compactions? Given that your bulk delete operation likely touched several shards, and those shards all require multiple levels of compaction, limiting concurrency has helped other folks in the past. Start by limiting to 1. https://docs.influxdata.com/influxdb/v1.7/administration/config/#max-concurrent-compactions-0

jacobmarble commented 4 years ago

Possibly related to #15271

bryanspears commented 4 years ago

@jacobmarble your suggestion has stabilized our small Influx setup. For anyone else running on limited-I/O, cloud-based resources: these configuration options helped dramatically. Compaction does take longer, but that's better than the alternative, which for us was a crash.

max-concurrent-compactions = 1
wal-fsync-delay = "50ms"
compact-throughput = "10m"
compact-throughput-burst = "100m"
jacobmarble commented 4 years ago

@bryanspears I'm glad to know that limiting concurrent compactions helped. In many situations, compaction duration doesn't cost anything, and slower compactions free resources for queries and writes. For anyone else reading, when dealing with this sort of write-vs-read contention, I suggest the following.

Be aware that setting wal-fsync-delay is often helpful in constrained-I/O situations (fewer calls to fsync()), but it also risks losing the most recently written data (up to 50ms' worth in the example) when power is lost. For many use cases, this is acceptable.

Another way to help WAL I/O is to write in batches, say 5000 points per batch. This way, you can let it fsync() with every call to the /write endpoint (wal-fsync-delay=0), since there are fewer such calls.
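As a rough illustration of that batching advice (endpoint per the 1.x HTTP API; the database name and file are placeholders), one /write call can carry thousands of points of line protocol, so a per-call fsync() stays cheap:

# batch.txt contains ~5000 lines of line protocol
curl -XPOST 'http://localhost:8086/write?db=mydb' --data-binary @batch.txt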

lazzarello commented 4 years ago

This issue was the first hit for a search about slow DROP DATABASE operations. I'm on 1.7.9 and saw the same performance. I thought my network connection had dropped, so after an hour I sent Ctrl+C and it returned to the influx shell - so it's not the network. The database I wish to drop shows up in SHOW DATABASES, but when I USE it and run SHOW MEASUREMENTS I get the following output: ERR: engine is closed.

I'm optimizing for disk space, so I very much would like to delete this temporary database.

yozik04 commented 4 years ago

I am cleaning up some invalid values from my measurement in Python code. The measurement has data for a year at roughly one-minute resolution.

DELETE FROM Cool_Return_Temp WHERE time = $time;

The query is tremendously slow. I am wondering whether it is possible to pass an array of times so that the index is rebuilt only once.
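One way to batch such calls (a sketch only, with no guarantee it avoids repeated index work): InfluxQL accepts several semicolon-separated statements in a single /query request, so the deletes at least share one round trip. The database name and timestamps below are placeholders:

curl -XPOST 'http://localhost:8086/query?db=mydb' \
  --data-urlencode 'q=DELETE FROM "Cool_Return_Temp" WHERE time = 1546300800000000000; DELETE FROM "Cool_Return_Temp" WHERE time = 1546300860000000000'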

Unfortunately, the following query was not working the last time I tried:

DELETE FROM Cool_Return_Temp WHERE value > 50;
bapBardas commented 3 years ago

Experiencing the same difficulties here with InfluxDB v1.8: it takes more than 25 minutes to drop one measurement whose data is stored across 3 shards with a 4-week duration. I have 42k measurements to delete; this is not acceptable for our production platform, so I'm going to need to find a workaround.

lukaszgryglicki commented 3 years ago

Wow, over 3 years have passed since I reported this and it's still not fixed...

cheeseandcereal commented 3 years ago

Not sure how related this is, but I've got a measurement with a single point on InfluxDB 1.8.5, and when I run DROP MEASUREMENT <name>, the operation just hangs forever. Not sure how to debug further. Restarting InfluxDB does not seem to help.

jalesingh commented 3 years ago

I am also getting the same issue, even though I have only 7-8 measurements and none of them has more than 1500 records. Version: CLI: Influx CLI 2.0.6 (git: 4db98b4c9a), build_date: 2021-04-29T16:48:12Z; DB: Version 2.0.5 ('7c3ead)

Delete command:

curl -X POST \
  'http://127.0.0.1:8086/api/v2/delete?bucket=intralogistics&org=greyorange' \
  -H 'authorization: Token hzCzxEPyoXUcnJOkGQDuYzlMIQZ06jUKP8De-HKSQU5hmgXAAWevPXuZcrduV9OTIl-8phT2-ZFyd-sqMOeLSA==' \
  -H 'content-type: application/json' \
  -d '{ "start":"2020-06-20T00:00:00.000000Z", "stop":"2021-06-20T23:59:59.999999Z", "predicate": "_measurement=\"service_request\" AND external_service_request_id=\"order_234_20210618135243\"" }'

iostat.txt vars.txt profiles.tar.gz

I am using the default configuration, so please let me know if I need to set any configuration options.

NicolasGoeddel commented 3 years ago

On InfluxDB v1.8.5, we also want to drop measurements for hosts that no longer exist. But after dropping a few of them, the database just locks up and does not accept further drop statements. Then we have to wait up to a day until it becomes available again.

Would be nice to get this finally solved.

Also, does anyone know whether this issue persists in version 2.x?

msherman13 commented 2 years ago

Experiencing this issue as well. We had a problem a while back which caused a measurement to be created with about 300k series. The new measurement is small, only about 1M rows, but trying to drop it hangs forever. The cardinality of the database is blown up now (before this measurement it was around 20k) and is causing all kinds of problems.

I'm not even able to successfully drop any series from this measurement now. For example, a DROP SERIES query filtering on just 10 series hangs forever. There's nothing useful in the logs.

We're on v1.8.6 with the TSI index. This issue is also causing our DB to take about an hour to start up...

Any ideas here?

robertsLando commented 1 year ago

Having this issue too, on Influx 1.8.10. In my case I'm using a query like:

DROP SERIES from /.*/ WHERE tag=test
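For anyone copying that pattern: InfluxQL generally expects tag values in a WHERE clause to be single-quoted strings, so the statement may need to look like the following sketch (identifiers taken from the query above):

DROP SERIES FROM /.*/ WHERE "tag" = 'test'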