개요

2024-07-31T16:51:12.276+09:00 DEBUG 28465 --- [nio-8080-exec-4] org.hibernate.SQL                        : 
    select
        ct1_0.id,
        ct1_0.created_at,
        ct1_0.crypto_exchange,
        ct1_0.close,
        ct1_0.symbol 
    from
        crypto_ticker ct1_0 
    where
        mod(ct1_0.created_at, ?)=? 
        and ct1_0.symbol=? 
        and ct1_0.crypto_exchange=? 
        and ct1_0.created_at<=? 
    order by
        ct1_0.created_at desc 
    limit
        ?

현재까지 수집한 거래소별 스테이블 코인 가격 데이터를 모아서 영역 차트 데이터 API를 개발했습니다. 데이터가 15분 간격으로 쌓이면서 현재 10만 개 이상의 데이터가 축적되고 있으며, 이로 인해 응답 시간이 점점 길어지고 있어 개선을 시도하려고 합니다.

테스트 환경

AWS EC2 t2.small MySQL Ver 8.0.36 for Linux on x86_64 (MySQL Community Server - GPL)

튜닝 전 쿼리 성능 테스트

[ec2-user@ip-10-0-3-212 ~]$ sudo mysqlslap --user=test --password --host=localhost --port=3306 --concurrency=10 --iterations=10 --create-schema=prod 
--query=/home/ec2-user/query.sql
Enter password: 
Benchmark
        Average number of seconds to run all queries: 0.624 seconds
        Minimum number of seconds to run all queries: 0.608 seconds
        Maximum number of seconds to run all queries: 0.655 seconds
        Number of clients running queries: 10
        Average number of queries per client: 1

select_type	type	rows	filtered	Extra
SIMPLE	ALL	111896	0.33	Using where; Using filesort

성능 테스트를 위해 mysqlslap을 사용했으며, 10개의 동시 연결 상태에서 10회 반복하여 테스트를 진행했습니다.

현재 쿼리는 10만 개의 데이터를 풀 스캔하고 있으며, 평균 응답 시간이 0.6초 걸리는 것을 확인할 수 있었습니다.

단일 인덱스 적용

ALTER TABLE crypto_ticker
    ADD INDEX crypto_ticker_index (created_at);

[ec2-user@ip-10-0-3-212 ~]$ sudo mysqlslap --user=test --password --host=localhost --port=3306 --concurrency=10 --iterations=10 --create-schema=prod --query=/home/ec2-user/query.sql
Enter password: 
Benchmark
        Average number of seconds to run all queries: 0.182 seconds
        Minimum number of seconds to run all queries: 0.159 seconds
        Maximum number of seconds to run all queries: 0.208 seconds
        Number of clients running queries: 10
        Average number of queries per client: 1

select_type	type	rows	filtered	Extra
SIMPLE	range	55957	1	Using index condition; Using where; Backward index scan

단순히 인덱스 하나만으로 성능이 많이 개선되었습니다.

추가적으로 Extra 컬럼에 Using where;이 표기되어 있는 것을 확인할 수 있고, rows가 전체 데이터의 50%로 불필요하게 많은 데이터를 참조하고 있는 것을 확인할 수 있습니다.

여기서 created_at, symbol, crypto_exchange를 복합 인덱스로 만들면, 스토리지 엔진에서 필터링된 조건만 MySQL 엔진에 전달하여 성능이 개선될 것으로 예상했습니다.

복합 인덱스 적용

ALTER TABLE crypto_ticker
    ADD INDEX crypto_ticker_index (created_at,symbol, crypto_exchange);

[ec2-user@ip-10-0-3-212 ~]$ sudo mysqlslap --user=test --password --host=localhost --port=3306 --concurrency=10 --iterations=10 --create-schema=prod --query=/home/ec2-user/query.sql
Enter password: 
Benchmark
        Average number of seconds to run all queries: 0.103 seconds
        Minimum number of seconds to run all queries: 0.098 seconds
        Maximum number of seconds to run all queries: 0.131 seconds
        Number of clients running queries: 10
        Average number of queries per client: 1

select_type	type	rows	filtered	Extra
SIMPLE	range	55957	1	Using index condition; Backward index scan

복합 인덱스를 구성할 때 카디널리티가 높은 created_at, symbol, crypto_exchange 순으로 구성 후 테스트를 진행했습니다.

복합 인덱스를 구성했을 때 단일 인덱스보다 성능이 개선된 것을 확인할 수 있었으며, 추가적으로 Extra 컬럼에서 Backward index scan;을 개선하기 위해 crypto_exchange 인덱스에 desc를 붙이고 다시 테스트를 진행했습니다.

Descending index 적용

ALTER TABLE crypto_ticker
    ADD INDEX crypto_ticker_index (created_at desc, symbol, crypto_exchange);

[ec2-user@ip-10-0-3-212 ~]$ sudo mysqlslap --user=test --password --host=localhost --port=3306 --concurrency=10 --iterations=10 --create-schema=prod --query=/home/ec2-user/query.sql
Enter password: 
Benchmark
        Average number of seconds to run all queries: 0.314 seconds
        Minimum number of seconds to run all queries: 0.290 seconds
        Maximum number of seconds to run all queries: 0.387 seconds
        Number of clients running queries: 10
        Average number of queries per client: 1

select_type	type	rows	filtered	Extra
SIMPLE	ALL	112118	0.5	Using where

결과는 예상과 달리 풀 스캔이 진행되었습니다. 예상과 다른 결과가 나온 이유를 고민해 본 결과, 다음과 같은 문제가 있었습니다.

복합 인덱스에서 범위 조건이 적용될 경우, 해당 컬럼은 인덱스를 사용하지만 그 뒤의 인덱스 컬럼들은 인덱스를 사용하지 않습니다. 이로 인해 created_at, symbol, crypto_exchange 인덱스에서 created_at은 인덱스를 사용했지만 symbol과 crypto_exchange는 인덱스를 사용하지 않았습니다.

이런 상황에서 created_at의 인덱스가 내림차순으로 정렬(과거 데이터부터 최신 데이터 순으로 정렬)되어 있었기 때문에, 가장 마지막에 있는 최신 데이터를 가져오기 위해 옵티마이저가 풀 스캔이 효율적이라고 판단한 것 같습니다. 실제로 과거에 가까운 데이터를 조회했을 때는 인덱스를 사용했습니다.

해당문제를 해결하기 위해서 인덱스 순서를 수정했습니다.

결과

ALTER TABLE crypto_ticker
    ADD INDEX crypto_ticker_index (symbol, crypto_exchange, created_at desc);

[ec2-user@ip-10-0-3-212 ~]$ sudo mysqlslap --user=thdwoqor --password --host=localhost --port=3306 --concurrency=10 --iterations=10 --create-schema=prod --query=/home/ec2-user/query.sql
Enter password: 
Benchmark
        Average number of seconds to run all queries: 0.047 seconds
        Minimum number of seconds to run all queries: 0.046 seconds
        Maximum number of seconds to run all queries: 0.052 seconds
        Number of clients running queries: 10
        Average number of queries per client: 1

select_type	type	rows	filtered	Extra
SIMPLE	range	11462	100	Using index condition;

결과적으로 영역 차트 데이터 조회 요청 시 평균 응답 시간을 608ms에서 46ms로 단축했습니다. 추가적으로, 차트 데이터의 특성상 캐싱을 도입하여 성능을 더욱 향상시킬 수 있을 것 같습니다.

thdwoqor / stable-coin-checker

영역 차트 데이터 쿼리 튜닝 #10

개요

테스트 환경

튜닝 전 쿼리 성능 테스트

단일 인덱스 적용

복합 인덱스 적용

Descending index 적용

결과