apple / ccs-calendarserver

The Calendar and Contacts Server.
https://www.calendarserver.org
Apache License 2.0

Database Connections #502

Closed: sandeepksingh13 closed this issue 6 years ago

sandeepksingh13 commented 6 years ago

The application is making too many database connections.

dreness commented 6 years ago

Hello,

The relevant config parameters are:

MultiProcess -> ProcessCount (https://github.com/apple/ccs-calendarserver/blob/master/conf/caldavd-stdconfig.plist#L783) - defines the number of worker processes used

MaxDBConnectionsPerPool (https://github.com/apple/ccs-calendarserver/blob/master/conf/caldavd-stdconfig.plist#L1840) - the number of DB connections allowed for each process

The maximum number of DB connections used by a CalendarServer host is ProcessCount * MaxDBConnectionsPerPool.
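For concreteness, the ceiling works out as a simple product of those two settings. A minimal sketch in Python, with hypothetical values (8 workers, 4 connections per pool):

```python
# Hedged sketch: the DB connection ceiling for one CalendarServer host is
# ProcessCount * MaxDBConnectionsPerPool. The numbers below are hypothetical.
process_count = 8                  # MultiProcess -> ProcessCount
max_db_connections_per_pool = 4    # MaxDBConnectionsPerPool

max_connections = process_count * max_db_connections_per_pool
print(max_connections)  # 32
```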

Maybe what you are really saying is "I know how to configure things. CalendarServer uses an unexpectedly high number of DB connections to perform my workload." This is a reasonable characterization in my experience, at least in some cases. Large-scale CalendarServer deployments likely need a strong DB server configured to allow a lot of concurrency, but it still might vary widely based on workload. For example, all CardDAV workloads are far simpler and cheaper to process than common CalDAV workloads involving scheduling with multiple users.

We do have an experimental SharedConnectionPool option that is described in the comment for MaxDBConnectionsPerPool:

<!-- The maximum number of outstanding database connections per database
     connection pool. When SharedConnectionPool (see above) is set to True,
     this is the total number of outgoing database connections allowed to the
     entire server; when SharedConnectionPool is False - this is the default -
     this is the number of database connections used per worker process. -->
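A caldavd.plist override enabling the shared pool might look like the fragment below. This is a sketch: the key names follow the stdconfig comment above, but the integer value is illustrative, not a recommendation.

```xml
<key>SharedConnectionPool</key>
<true/>
<!-- With SharedConnectionPool enabled, this is the server-wide total,
     not a per-worker limit. The value 20 is purely illustrative. -->
<key>MaxDBConnectionsPerPool</key>
<integer>20</integer>
```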

Using this might let you get by with a lower number of DB connections, but whether this is advisable in your case depends a lot on the nature of your workload and the size / scale of your server(s). Be advised, though: SharedConnectionPool is considered experimental and hasn't been used at scale, as far as I know.

In general, the Python concurrency model means we have to go somewhat wider with process count than might otherwise be necessary. There is a whole lot of space between a per-process DB connection pool (where connections are shared between 'threads' in that process) and a server-wide DB connection pool, but those are the only two options you have. In the first case, you have potentially a lot of idle or mostly idle DB connections; in the second case, you risk bottlenecking in the process hosting the shared pool. For serious deployments, the DB needs to be perfectly content to handle the maximum concurrency as configured by CalendarServer, because resource usage is often incredibly bursty even with common workloads, assuming you are optimizing for minimum request times.

There is also the option of tuning the WorkQueue priority thresholds (https://github.com/apple/ccs-calendarserver/blob/master/conf/caldavd-stdconfig.plist#L226) to penalize expensive work by deferring it during burst periods, but this really only works to smooth out the peaks and valleys. If you really want to get fine-grained, you can also customize the relative cost ('weight') and priority of each kind of WorkItem (https://github.com/apple/ccs-twistedextensions/blob/master/twext/enterprise/jobs/workitem.py; also read all the comments in this file!). Each different kind of work has a corresponding WORK table in the DB. A sample config for making expensive CalDAV scheduling work less concurrent by raising the weight (which defaults to 5) might look like:

  <key>WorkQueue</key>
    <dict>
      <key>workParameters</key>
      <dict>
        <key>SCHEDULE_ORGANIZER_WORK</key>
        <dict>
          <key>priority</key>
          <integer>1</integer>
          <key>weight</key>
          <integer>7</integer>
        </dict>
        <key>SCHEDULE_ORGANIZER_SEND_WORK</key>
        <dict>
          <key>priority</key>
          <integer>1</integer>
          <key>weight</key>
          <integer>7</integer>
        </dict>
        <key>SCHEDULE_AUTO_REPLY_WORK</key>
        <dict>
          <key>priority</key>
          <integer>1</integer>
          <key>weight</key>
          <integer>7</integer>
        </dict>
        <key>SCHEDULE_REPLY_WORK</key>
        <dict>
          <key>priority</key>
          <integer>1</integer>
          <key>weight</key>
          <integer>7</integer>
        </dict>
        <key>SCHEDULE_REFRESH_WORK</key>
        <dict>
          <key>priority</key>
          <integer>1</integer>
          <key>weight</key>
          <integer>7</integer>
        </dict>
      </dict>
    </dict>

A good overview of your workload can be produced by using contrib/tools/protocolanalysis.py to make reports from your access logs. If you'd like to share a report and some details about your server hardware, perhaps we can make recommendations for tuning.

Does this help?

sandeepksingh13 commented 6 years ago

Thank you, your explanation was quite helpful. I was able to configure CalendarServer to use a reasonable number of connections to the database by decreasing MaxDBConnectionsPerPool.


dreness commented 6 years ago

@sandeepksingh13 Glad to hear it.

The default ProcessCount setting of 0 means: dynamically choose a value (https://github.com/apple/ccs-calendarserver/blob/76d2d7802125edb297365d556d0d6cbd11822def/twistedcaldav/util.py#L121) that is the lesser of installed RAM in GB or the number of CPU cores, but at least 2. On a large system where CalendarServer has only a modest workload, the default ProcessCount might be significantly higher than needed; setting it to a lower value would free up DB concurrency as well as memory and file descriptors on the CalendarServer host.

Reducing only MaxDBConnectionsPerPool reduces maximum concurrency against the DB without affecting the HTTP concurrency limit in CalendarServer (ProcessCount * MaxRequests). MaxRequests (https://github.com/apple/ccs-calendarserver/blob/master/conf/caldavd-stdconfig.plist#L1828) defaults to 3, because most HTTP requests are not CPU-bound, so having some in-process HTTP concurrency is useful.

The "Load Analysis" section of a protocolanalysis report includes an "average queue depth" column, which is the average number of 'slots' that were in use across all the processes for the averaging interval. A value equal to MaxRequests indicates that CalendarServer is at or near maximum HTTP concurrency.
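The default-ProcessCount rule described above can be sketched as follows. This is a paraphrase of the behavior documented for twistedcaldav/util.py, not the exact implementation; the function name and example inputs are mine.

```python
def default_process_count(ram_gb: int, cpu_cores: int, minimum: int = 2) -> int:
    """Sketch of the dynamic ProcessCount chosen when the config value is 0:
    the lesser of installed RAM in GB and the CPU core count, but at least
    `minimum` (2 by default)."""
    return max(minimum, min(ram_gb, cpu_cores))

# Hypothetical hosts:
print(default_process_count(64, 8))  # 8  (core-limited on a big-RAM box)
print(default_process_count(1, 4))   # 2  (the floor of 2 applies)
```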

In this example, I used the client sim to apply the gentle default load, then I turned the client concurrency way up.

Load Analysis
+---------------+----------+--------------+--------------+-----------+
| Local (UTC)   | Total    | Av. Requests | Av. Response | Av. Queue |
|               | Requests | Per Second   | Time(ms)     | Depth     |
|---------------+----------+--------------+--------------+-----------|
| 00:00 (07:00) |        0 |          0.0 |          0.0 |      0.00 |
| 00:01 (07:01) |        1 |          0.0 |         24.8 |      1.00 |
| 00:02 (07:02) |        2 |          0.0 |         15.4 |      1.00 |
| 00:03 (07:03) |        0 |          0.0 |          0.0 |      0.00 |
| 00:04 (07:04) |       80 |          1.3 |         45.3 |      1.00 |
| 00:05 (07:05) |      328 |          5.5 |         35.4 |      1.00 |
| 00:06 (07:06) |      418 |          7.0 |         37.0 |      1.00 |
| 00:07 (07:07) |      567 |          9.4 |         35.0 |      1.00 |
| 00:08 (07:08) |      523 |          8.7 |         28.4 |      1.00 |
| 00:09 (07:09) |      333 |          5.5 |         29.9 |      1.00 |
| 00:10 (07:10) |        0 |          0.0 |          0.0 |      0.00 |
| 00:11 (07:11) |        0 |          0.0 |          0.0 |      0.00 |
| 00:12 (07:12) |      264 |          4.4 |         15.0 |      1.00 |
| 00:13 (07:13) |     3809 |         63.5 |         52.6 |      1.09 |
| 00:14 (07:14) |     4745 |         79.1 |         68.7 |      1.21 |
| 00:15 (07:15) |     2164 |         36.1 |        224.3 |      2.72 |
| 00:16 (07:16) |        0 |          0.0 |          0.0 |      0.00 |

sandeepksingh13 commented 6 years ago

Hi, can you please give some information regarding the Oracle database connection configuration? I could not find any reference for how to set up an Oracle database. Also, is it possible to use Amazon's ElastiCache service instead of memcached?

Regards,
Sandeep Kumar Singh
