influxdata / influxdb

Scalable datastore for metrics, events, and real-time analytics
https://influxdata.com
Apache License 2.0

[0.10.3] iostat write 100%, large io write #6050

Closed ghost closed 8 years ago

ghost commented 8 years ago

Thank you for an amazing product.

I have a problem with version 0.10.3, or perhaps I don't understand what I am doing. Disk utilization is at 100%, and /var/log/influxdb/influxd.log shows writes taking several seconds:

[http] 2016/03/16 16:14:52 127.0.0.1 - test [16/Mar/2016:16:14:47 +0100] POST /write?db=test&p=%5BREDACTED%5D&precision=s&u=test HTTP/1.1 204 0 - - d2690cc8-eb89-11e5-ac60-000000000000 9.330683451s

After some time I get a 502 error.

I am writing ~10,000 points/s.

How can I configure InfluxDB to use more RAM as a cache before writing to disk?

UPDATE: I thought the cache was too small, so I increased some settings:

cache-max-memory-size = 52428800000
cache-snapshot-memory-size = 26214400000

but I don't see any significant change:

18:09:17          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
18:09:18       dev8-0    113.00      0.00   2392.00     21.17      0.92      8.18      8.04     90.80
18:09:18      dev8-16    111.00      0.00   2392.00     21.55      0.91      7.96      8.18     90.80
18:09:18       dev9-3      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
18:09:18       dev9-2    259.00      0.00   2392.00      9.24      0.00      0.00      0.00      0.00
jwilder commented 8 years ago

@umbri Are you using SSD or spinning disk? Those cache memory settings shouldn't be so high. I'd leave them at the default.
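
For scale, here is a small sketch that converts the two posted cache values from bytes to GiB. It only does the unit arithmetic on the numbers quoted above; the helper name `gib` is made up for illustration and nothing here reads an actual InfluxDB config.

```python
def gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


# Values copied from the settings posted in this thread.
settings = {
    "cache-max-memory-size": 52428800000,
    "cache-snapshot-memory-size": 26214400000,
}

for name, value in settings.items():
    print(f"{name} = {value} bytes = {gib(value):.2f} GiB")
```

That is roughly 48.8 GiB and 24.4 GiB on a machine reported to have 128 GB of RAM.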

ghost commented 8 years ago
=== START OF INFORMATION SECTION ===
Model Family:     Hitachi/HGST Ultrastar 7K4000
Device Model:     HGST HUS724020ALA640
Serial Number:    PN2181P5H0YLTX
LU WWN Device Id: 5 000cca 24ece86a4
Firmware Version: MF6OAA70
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

It is a spinning disk. I have read the performance recommendations you suggest, but I have 128 GB of RAM that I want to use as much as possible instead of the HDD, so I am trying to configure InfluxDB to use RAM rather than the HDD (if that is possible).

/dev/md3:
        Version : 0.90
     Raid Level : raid1
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 3
    Persistence : Superblock is persistent
jwilder commented 8 years ago

It looks like you are maxing out the IOPS for that single drive based on your write load. SSDs will give you better performance in general. If you can't use SSDs, you may need to use multiple drives in a RAID configuration to increase your available IOPS.
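
To make the IOPS point concrete, here is a rough sketch that reads one row of the `sar` output posted earlier and works out the actual write throughput. It assumes the column order shown in that output and that `wr_sec/s` counts 512-byte sectors; the function name `summarize` is hypothetical.

```python
SECTOR_BYTES = 512  # assumption: sar reports sectors of 512 bytes


def summarize(row: str) -> dict:
    """Summarize one sar device row.

    Expected columns: time, DEV, tps, rd_sec/s, wr_sec/s,
    avgrq-sz, avgqu-sz, await, svctm, %util.
    """
    fields = row.split()
    tps = float(fields[2])
    wr_sectors = float(fields[4])
    write_bytes = wr_sectors * SECTOR_BYTES
    return {
        "device": fields[1],
        "tps": tps,
        "write_mib_per_s": write_bytes / 2**20,
        # Average size per request; valid here because rd_sec/s is 0,
        # so every request in the sample is a write.
        "avg_write_kib": write_bytes / tps / 1024 if tps else 0.0,
        "util_percent": float(fields[9]),
    }


# Row for dev8-0, copied from the output posted in this thread.
row = ("18:09:18       dev8-0    113.00      0.00   2392.00     21.17"
       "      0.92      8.18      8.04     90.80")
print(summarize(row))
```

For that row the disk is about 91% busy while writing only about 1.2 MiB/s in requests of roughly 10.6 KiB each, which is consistent with the drive being limited by the number of operations rather than by bandwidth.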

The kernel will use free memory for the file system cache when possible. Increasing the cache in the DB ends up wasting RAM because the same data is cached twice in memory. It can also cause problems in other parts of the system if it's not sized appropriately. Either way, increasing the cache settings won't reduce your IOPS needs.

If you want to completely eliminate your disks, you could set up a RAM disk and change your config to write there. That is obviously not recommended if you don't want to lose data, though.

ghost commented 8 years ago

Can I read somewhere about how InfluxDB writes and caches data, so that I understand how to configure my system? I specifically configured my server with a lot of RAM. I use Redis (under high load), so I thought I could configure InfluxDB to behave like Redis.

Can you suggest any InfluxDB config settings that could help me? I think a RAM disk is a great idea (I will try it).

P.S. Thank you for an amazing product, and sorry for my English.

jwilder commented 8 years ago

I would suggest asking this question on the mailing list.

For docs you could take a look at:

ghost commented 8 years ago

I just created a new RAM disk and configured InfluxDB to use it, and this solved all the problems. What about persistence? If I copy everything InfluxDB writes to the RAM disk to a persistent disk, say every 10 minutes, will I lose only the last 10 minutes in case of a fault, or does it depend on the InfluxDB config? And if it depends, what do I need to configure to keep this loss small?
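
A minimal sketch of the copy-every-N-minutes idea, for illustration only. The function names and any paths passed in are hypothetical, and it simply copies a directory tree. Whether a copy taken while InfluxDB is writing can be restored cleanly is not answered in this thread, so treat that as an open assumption.

```python
import shutil
import time
from pathlib import Path


def snapshot(src: Path, backup_root: Path) -> Path:
    """Copy src into a new timestamped directory under backup_root.

    The copy goes to a ".partial" directory first and is renamed once
    complete, so an interrupted copy is never mistaken for a finished one.
    Two calls within the same second would collide and raise an error.
    """
    stamp = time.strftime("%Y%m%dT%H%M%S")
    partial = backup_root / f".{stamp}.partial"
    final = backup_root / stamp
    shutil.copytree(src, partial)
    partial.rename(final)
    return final


def worst_case_points_lost(points_per_second: int, interval_seconds: int) -> int:
    """Upper bound on points written since the last completed snapshot."""
    return points_per_second * interval_seconds


# Hypothetical usage, e.g. from a cron job or a loop that sleeps 600 s:
#   snapshot(Path("/mnt/ramdisk/influxdb"), Path("/var/backups/influxdb"))
```

At the ~10,000 points/s mentioned earlier, a 10-minute interval means up to `worst_case_points_lost(10000, 600)`, that is 6,000,000 points, could be missing from the most recent snapshot, plus whatever is written while the copy itself runs.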