intel-cloud / cosbench

a benchmark tool for cloud object storage service

S3 storage write times out even though the data has been stored successfully in Ceph #354

Closed cfanz closed 7 years ago

cfanz commented 7 years ago

Hi guys: I'm facing a strange problem where COSBench always complains about a timeout when executing S3 write operations. My frontend is Nginx. With s3cmd I can put/get files without any problem, but COSBench raises a timeout exception once the default 30 seconds have passed, as shown below:

2017-06-28 15:52:53,110 [INFO] [Log4jLogManager] - will append log to file /root/0.4.1.0/log/mission/M2EDB04D32F.log
2017-06-28 15:52:53,440 [INFO] [NoneStorage] - performing PUT at /test5/AA2
2017-06-28 15:52:53,436 [INFO] [NoneStorage] - performing PUT at /test5/AA2
2017-06-28 15:52:53,442 [INFO] [NoneStorage] - performing PUT at /test5/AA6
2017-06-28 15:52:53,442 [INFO] [NoneStorage] - performing PUT at /test5/AA4
2017-06-28 15:53:43,641 [ERROR] [AbstractOperator] - worker 4 fail to perform operation test5/AA2
com.intel.cosbench.api.storage.StorageException: com.amazonaws.AmazonClientException: Encountered an exception and couldn't reset the stream to retry
        at com.intel.cosbench.api.S3Stor.S3Storage.createObject(S3Storage.java:126)
        at com.intel.cosbench.driver.operator.Writer.doWrite(Writer.java:98)
        at com.intel.cosbench.driver.operator.Writer.operate(Writer.java:79)
        at com.intel.cosbench.driver.operator.AbstractOperator.operate(AbstractOperator.java:76)
        at com.intel.cosbench.driver.agent.WorkAgent.performOperation(WorkAgent.java:197)
        at com.intel.cosbench.driver.agent.WorkAgent.doWork(WorkAgent.java:177)
        at com.intel.cosbench.driver.agent.WorkAgent.execute(WorkAgent.java:134)
        at com.intel.cosbench.driver.agent.AbstractAgent.call(AbstractAgent.java:44)
        at com.intel.cosbench.driver.agent.AbstractAgent.call(AbstractAgent.java:1)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: com.amazonaws.AmazonClientException: Encountered an exception and couldn't reset the stream to retry
        at com.amazonaws.http.AmazonHttpClient.resetRequestAfterError(AmazonHttpClient.java:400)
        at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:356)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:190)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:2974)
        at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1149)
        at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1014)
        at com.intel.cosbench.api.S3Stor.S3Storage.createObject(S3Storage.java:124)
        ... 12 more
Caused by: java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:153)
        at java.net.SocketInputStream.read(SocketInputStream.java:122)
        at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:149)
        at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:110)
        at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:260)
        at org.apache.http.impl.conn.DefaultResponseParser.parseHead(DefaultResponseParser.java:98)
        at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:252)
        at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:281)
        at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:247)
        at org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:219)
        at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:298)
        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
        at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:633)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:454)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:732)
        at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:306)
        ... 17 more
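
To see how long a worker actually waited before the failure was logged, the timestamps in the excerpt above can be subtracted. The following is a minimal sketch, assuming the log line layout shown in this post; the function name `seconds_until_failure` and the regular expressions are illustrative and not part of COSBench.

```python
import re
from datetime import datetime

# Lines copied from the log excerpt in this post.
SAMPLE_LOG = """\
2017-06-28 15:52:53,440 [INFO] [NoneStorage] - performing PUT at /test5/AA2
2017-06-28 15:52:53,436 [INFO] [NoneStorage] - performing PUT at /test5/AA2
2017-06-28 15:53:43,641 [ERROR] [AbstractOperator] - worker 4 fail to perform operation test5/AA2
"""

# Assumed layout: "<timestamp> [LEVEL] [Logger] - <message>"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \[\w+\] \[\w+\] - (.*)$"
)
PUT_RE = re.compile(r"performing PUT at /(\S+)")
FAIL_RE = re.compile(r"fail to perform operation (\S+)")


def parse_ts(text):
    """Parse a log4j-style timestamp such as '2017-06-28 15:52:53,440'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S,%f")


def seconds_until_failure(log_text):
    """Map object path -> seconds from its first logged PUT to its failure line."""
    started = {}
    elapsed = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip stack-trace lines and anything else
        ts_text, message = match.groups()
        put = PUT_RE.search(message)
        fail = FAIL_RE.search(message)
        if put:
            # Keep the first PUT seen for this path (in file order).
            started.setdefault(put.group(1), parse_ts(ts_text))
        elif fail and fail.group(1) in started:
            delta = parse_ts(ts_text) - started[fail.group(1)]
            elapsed[fail.group(1)] = delta.total_seconds()
    return elapsed


if __name__ == "__main__":
    for path, secs in seconds_until_failure(SAMPLE_LOG).items():
        print(f"{path}: {secs:.3f} s between PUT and failure")
```

For the three lines copied above, this reports about 50.2 seconds for `test5/AA2`. The log alone does not say how that interval is split between the socket timeout and any retry handling inside the SDK.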

But the rgw log says it returned http_status=200 for all of those write operations within 1 second. Here is the job config file:

<?xml version="1.0" encoding="UTF-8" ?>
<workload name="s3-sample" description="sample benchmark for s3">
  <storage type="s3" config="accesskey=tmp;secretkey=tmp;endpoint=http://10.1.7.6:7481" />
  <workflow>
    <workstage name="main">
      <work name="main" workers="4" runtime="60">
        <operation type="write" ratio="100" config="cprefix=test;containers=c(5,5);oprefix=AA;objects=u(1,6);sizes=c(64)KB" />
      </work>
    </workstage>
  </workflow>
</workload>

Nginx is listening on 10.1.7.6:7481. The OS is Ubuntu 14.04 with JDK 7.

Thanks!

cfanz commented 7 years ago

By the way, COSBench works fine when civetweb is used as the frontend.

tianshan commented 7 years ago

It looks like you are using Ceph. You could set debug_rgw = 20 to see whether the COSBench requests reach the rgw.
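
Once the rgw log is more verbose, one quick check is to tally the status codes it reports. This is a minimal sketch that only relies on the `http_status=` token quoted elsewhere in this thread; the sample lines are made up for illustration and the real rgw line layout may differ. The function name `count_http_status` is hypothetical.

```python
import re
from collections import Counter

# Hypothetical log lines: only the "http_status=NNN" token is taken from
# this thread, the rest of each line is an invented placeholder.
SAMPLE_RGW_LOG = """\
req A done http_status=200
req B done http_status=200
req C done http_status=403
a line without any status
"""


def count_http_status(log_text):
    """Count how often each three-digit http_status value appears in the text."""
    return Counter(re.findall(r"http_status=(\d{3})", log_text))


if __name__ == "__main__":
    for status, count in sorted(count_http_status(SAMPLE_RGW_LOG).items()):
        print(f"http_status={status}: {count}")
```

If every PUT shows up with status 200 on the rgw side while the client still times out, the requests are reaching rgw and the response is being lost or delayed somewhere between rgw and the client.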

cfanz commented 7 years ago

Wow, it seems that version 0.4.2.4c fixes this problem. The problem occurs in 0.4.1.0 and 0.3.1.2. @tianshan, thank you for your advice; rgw did indeed receive the requests and completed them in time.