spring-projects / spring-session

Spring Session
https://spring.io/projects/spring-session
Apache License 2.0

Add Dropwizard Metrics #535

Open solidjb opened 8 years ago

solidjb commented 8 years ago

Add Dropwizard Metrics

solidjb commented 8 years ago

It looks like all of the current SessionRepository implementations do the same sort of expiry check on get, and then delete the session. Could we just create an abstract class to encapsulate that logic? Something like:

package org.springframework.session;

/**
 * A class that facilitates deleting expired sessions when they are found.
 */
public abstract class ExpiringSessionRepository implements SessionRepository<ExpiringSession> {

    @Override
    public ExpiringSession getSession(String id) {
        ExpiringSession saved = findSessionBy(id);
        if (saved == null) {
            return null;
        }
        if (saved.isExpired()) {
            delete(saved.getId());
            return null;
        }
        return saved;
    }

    protected abstract ExpiringSession findSessionBy(String id);
}
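To show how a concrete store would plug into that template, here is a self-contained analog using stand-in types (StubSession, ExpireOnReadRepository, and MapBackedRepository are illustrative names, not Spring Session classes; the real version would use ExpiringSession and SessionRepository as above):

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for ExpiringSession (illustrative only).
interface StubSession {
    String getId();
    boolean isExpired();
}

// Analog of the proposed abstract class: the expired-on-read check lives
// here, and subclasses only supply lookup and delete.
abstract class ExpireOnReadRepository {
    public StubSession getSession(String id) {
        StubSession saved = findSessionBy(id);
        if (saved == null) {
            return null;
        }
        if (saved.isExpired()) {
            delete(saved.getId());
            return null;
        }
        return saved;
    }

    protected abstract StubSession findSessionBy(String id);

    protected abstract void delete(String id);
}

// A Map-backed store standing in for Redis/Hazelcast/JDBC/etc.
class MapBackedRepository extends ExpireOnReadRepository {
    final Map<String, StubSession> store = new HashMap<>();

    protected StubSession findSessionBy(String id) {
        return store.get(id);
    }

    protected void delete(String id) {
        store.remove(id);
    }
}

public class ExpireOnReadExample {
    public static void main(String[] args) {
        MapBackedRepository repo = new MapBackedRepository();
        repo.store.put("a", new StubSession() {
            public String getId() { return "a"; }
            public boolean isExpired() { return true; }
        });
        // An expired session is purged on read and null is returned.
        System.out.println(repo.getSession("a"));        // prints null
        System.out.println(repo.store.containsKey("a")); // prints false
    }
}
```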
solidjb commented 8 years ago

Alternatively, the metrics counter could be based upon org.springframework.session.events.SessionCreatedEvent and org.springframework.session.events.SessionDestroyedEvent, but I am not sure that session duration would be available (or calculable) from the session that resides within the event.
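For what it's worth, if the destroyed event does still carry the saved session, an approximate duration could be derived from its timestamps. A minimal sketch, where SessionTimes is a stand-in interface (not a Spring Session type), assuming getCreationTime() and getLastAccessedTime() return epoch milliseconds as ExpiringSession does:

```java
// Stand-in for the timestamp accessors on ExpiringSession.
interface SessionTimes {
    long getCreationTime();
    long getLastAccessedTime();
}

public class SessionDurationExample {
    static long approximateDurationMillis(SessionTimes session) {
        // "Duration" here is creation to last access, not creation to
        // expiry; the gap between those two is the max-inactive interval.
        return session.getLastAccessedTime() - session.getCreationTime();
    }

    public static void main(String[] args) {
        SessionTimes s = new SessionTimes() {
            public long getCreationTime() { return 1_000L; }
            public long getLastAccessedTime() { return 61_000L; }
        };
        System.out.println(approximateDurationMillis(s)); // prints 60000
    }
}
```

Whether the session in the event still has usable timestamps at destroy time (for example after a Redis key expiry) is exactly the open question.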

rwinch commented 8 years ago

@solidjb This is not exactly true since it is store specific. For example, Redis can have an expiration set on the keys itself. This means the session can be expired even if delete has never been invoked.

We could consider listening for the events. However, the event happens on every node. This means we would count (number of sessions) * (number of nodes) events, which would not work.

If we tried to limit to a single node, this could be difficult because we wouldn't know what to do if a node goes down (we'd need leadership election which is not trivial).

solidjb commented 8 years ago

I think trying to limit nodes adds complexity we probably would not want to tackle. Would adding the metrics recording to all of the concrete classes, but having it be able to be turned off/on with a flag be a possibility?

rwinch commented 8 years ago

@solidjb Thanks for the response.

I think trying to limit nodes adds complexity we probably would not want to tackle.

I'm not sure I understand. Are you suggesting we don't support metrics for multiple nodes? We cannot reduce the problem to only supporting a single node because multiple nodes is one of the fundamental problems Spring Session solves.

Would adding the metrics recording to all of the concrete classes, but having it be able to be turned off/on with a flag be a possibility?

Unfortunately, that won't provide an accurate picture in many cases. The problem is that expired sessions in the data store won't be accounted for.

solidjb commented 8 years ago

Sorry for the confusion. I was trying to say that deduplicating events that happened across all nodes after the fact would be adding more complexity than we want to handle.

How many of the Repository implementations support offline cleanup of expired sessions? You called out Redis. The Mongo Javadoc references a one-minute cleanup. Do the JDBC or Map implementations have support classes that would also clean up offline?

solidjb commented 8 years ago

I am leaving everything below for historic purposes...

For Hazelcast, I have verified that the SessionEntryListener is called on every node in the cluster for every event. That is the opposite of what I was hoping for; please ignore everything below for now, until we come up with a better idea... :(


I am looking at org.springframework.session.hazelcast.SessionEntryListener. There are methods that publish Session events based on actions happening inside of hazelcast. For example:

public void entryAdded(EntryEvent<String, ExpiringSession> event) {
    if (logger.isDebugEnabled()) {
        logger.debug("Session created with id: " + event.getValue().getId());
    }
    this.eventPublisher.publishEvent(new SessionCreatedEvent(this, event.getValue()));
}

If we were to add code to the above method to increment the DropWizard Counter, would that happen N times (where N is the number of nodes in the cluster), or would it happen only once (with the corresponding event then propagated N times)?

If it only happens once, then we could add metrics code here for Hazelcast (gated by some sort of property - spring.session.metrics.enabled?).

It looks like similar things happen with GemFire:

@Override
public void afterCreate(EntryEvent<Object, ExpiringSession> event) {
    if (isExpiringSessionOrNull(event.getNewValue())) {
        handleCreated(event.getKey().toString(),
                toExpiringSession(event.getNewValue()));
    }
}

and in redis:

public void onMessage(Message message, byte[] pattern) {
    byte[] messageChannel = message.getChannel();
    byte[] messageBody = message.getBody();
    if (messageChannel == null || messageBody == null) {
        return;
    }

    String channel = new String(messageChannel);

    if (channel.startsWith(getSessionCreatedChannelPrefix())) {
        // TODO: is this thread safe?
        Map<Object, Object> loaded = (Map<Object, Object>) this.defaultSerializer
                .deserialize(message.getBody());
        handleCreated(loaded, channel);
        return;
    }

    String body = new String(messageBody);
    if (!body.startsWith(getExpiredKeyPrefix())) {
        return;
    }

    boolean isDeleted = channel.endsWith(":del");
    if (isDeleted || channel.endsWith(":expired")) {
        int beginIndex = body.lastIndexOf(":") + 1;
        int endIndex = body.length();
        String sessionId = body.substring(beginIndex, endIndex);

        RedisSession session = getSession(sessionId, true);

        if (logger.isDebugEnabled()) {
            logger.debug("Publishing SessionDestroyedEvent for session " + sessionId);
        }

        cleanupPrincipalIndex(session);

        if (isDeleted) {
            handleDeleted(sessionId, session);
        }
        else {
            handleExpired(sessionId, session);
        }

        return;
    }
}

even jdbc:

@Scheduled(cron = "0 * * * * *")
public void cleanUpExpiredSessions() {
    long now = System.currentTimeMillis();
    int maxInactiveIntervalSeconds = (this.defaultMaxInactiveInterval != null)
            ? this.defaultMaxInactiveInterval
            : MapSession.DEFAULT_MAX_INACTIVE_INTERVAL_SECONDS;

    final long sessionsValidFromTime = now - (maxInactiveIntervalSeconds * 1000);

    if (logger.isDebugEnabled()) {
        logger.debug(
                "Cleaning up sessions older than " + new Date(sessionsValidFromTime));
    }

    int deletedCount = this.transactionOperations.execute(new TransactionCallback<Integer>() {

        public Integer doInTransaction(TransactionStatus transactionStatus) {
            return JdbcOperationsSessionRepository.this.jdbcOperations.update(
                    getQuery(DELETE_SESSIONS_BY_LAST_ACCESS_TIME_QUERY),
                    sessionsValidFromTime);
        }

    });

    if (logger.isDebugEnabled()) {
        logger.debug("Cleaned up " + deletedCount + " expired sessions");
    }
}

I am very ignorant about all of this, but it looks to me like the lines of code above are all called by the underlying database technology to ensure that the Spring Session code is kept up to date with the state of the store.

If we create no-op versions of the Counter and Histogram, we wouldn't even have to put conditional code anywhere except the configuration modules. All the individual implementations would just call the metrics code. If the flag were false, the no-op versions would be injected and do nothing; if the flag were true, we would plug in full implementations of the Counter and Histogram.
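The no-op idea is essentially the Null Object pattern. A dependency-free sketch (SessionMetrics, NoOpSessionMetrics, and CountingSessionMetrics are hypothetical names; the real enabled implementation would delegate to Dropwizard's Counter and Histogram, which a plain long stands in for here):

```java
// Facade the repositories would call unconditionally; configuration decides
// which implementation to inject based on spring.session.metrics.enabled.
interface SessionMetrics {
    void sessionCreated();
    void sessionDestroyed();
}

// Injected when the flag is false: every call is deliberately a no-op.
final class NoOpSessionMetrics implements SessionMetrics {
    public void sessionCreated() { }
    public void sessionDestroyed() { }
}

// Injected when the flag is true; stands in for Counter.inc()/dec().
final class CountingSessionMetrics implements SessionMetrics {
    private long active;
    public void sessionCreated() { active++; }
    public void sessionDestroyed() { active--; }
    public long activeSessions() { return active; }
}

public class NoOpMetricsExample {
    public static void main(String[] args) {
        CountingSessionMetrics metrics = new CountingSessionMetrics();
        // Repository code would make these calls without any if-checks.
        metrics.sessionCreated();
        metrics.sessionCreated();
        metrics.sessionDestroyed();
        System.out.println(metrics.activeSessions()); // prints 1
    }
}
```

With this shape, only the configuration modules touch the flag, and the Dropwizard dependency can stay optional since repository code depends only on the facade.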

What do you think, @rwinch ?

solidjb commented 8 years ago

It looks like for Hazelcast there are two different types of EventListeners:

/**
* Adds a {@link MapListener} for this map. To receive an event, you should
* implement a corresponding {@link MapListener} sub-interface for that event.
*
* @param listener     {@link MapListener} for this map.
* @param includeValue <tt>true</tt> if <tt>EntryEvent</tt> should
*                     contain the value.
* @return A UUID.randomUUID().toString() which is used as a key to remove the listener.
* @see MapListener
*/
String addEntryListener(MapListener listener, boolean includeValue);

and

/**
 * Adds a {@link MapListener} for this map. To receive an event, you should
 * implement a corresponding {@link MapListener} sub-interface for that event.
 * <p/>
 * Note that entries in distributed map are partitioned across
 * the cluster members; each member owns and manages the some portion of the
 * entries. Owned entries are called local entries. This
 * listener will be listening for the events of local entries. Let's say
 * your cluster has member1 and member2. On member2 you added a local listener and from
 * member1, you call <code>map.put(key2, value2)</code>.
 * If the key2 is owned by member2 then the local listener will be
 * notified for the add/update event. Also note that entries can migrate to
 * other nodes for load balancing and/or membership change.
 *
 * @param listener {@link MapListener} for this map.
 * @return A UUID.randomUUID().toString() which is used as a key to remove the listener.
 * @see #localKeySet()
 * @see MapListener
 */
 String addLocalEntryListener(MapListener listener);

So for Hazelcast at least, we could add a local listener for metrics (based upon spring.session.metrics.enabled = true). The listener could be completely separate code from the current listener, so only configuration would need to react to the flag (allowing the DropWizard dependency to be optional). It may be that the other implementations (Redis, GemFire, etc.) also support something like a local listener.

tsachev commented 8 years ago

I think that for Hazelcast, apps may be configured as clients that are not part of the cluster. In that case I would expect that local listeners are never called.

solidjb commented 8 years ago

In my initial tests, the local create listener was called; however, the local remove listener was not. The jury is still out on the evict listener. I have more tests to run.

solidjb commented 8 years ago

Follow-up: the create and evict listeners are both called appropriately. Hooray! However, it is clear that if you do some sort of rolling deployment, the in-flight statistics are lost and you can get negative numbers. Bad :(

I am starting to think that there is literally no good solution for this because of the in-datastore evictions...