ratchetphp / Ratchet

Asynchronous WebSocket server
http://socketo.me
MIT License
6.28k stars 743 forks source link

Session writing, a working draft. Interested in feedback before push. #142

Open tomsseisums opened 11 years ago

tomsseisums commented 11 years ago

I'm not ready to pull request it, because what I did here was a total blindshot.

The implementation, actually, was pretty simple (and that's my main concern), posting this issue for review-discussion.

I have tested only with Symfony's PdoSessionHandler.


Ratchet\Session\SessionProvider:

The changes are minimal, only for onClose - added the Session::save call.

function onClose(ConnectionInterface $conn) {
    // "close" session for Connection

    $conn->Session->save();

    return $this->_app->onClose($conn);
}

Ratchet\Session\Storage\VirtualSessionStorage:

This is where the most additions come in, yes, to the newly invoked save method.

  1. As Chris had suggested in method comments, I took data from classic Bags, Metadata Bag and appended it to a temporary array for serialization.
  2. I pass it forward to Serializer::serialize for serialization.
  3. Serialized data is passed to saveHandler, which in my setup is Symfony\Component\HttpFoundation\Session\Storage\Handler\PdoSessionHandler.
public function save() {
    // get the data from the bags?
    // serialize the data
    // save the data using the saveHandler
//        $this->saveHandler->write($this->saveHandler->getId(), 

    $toSerialize = array();

    // Push calssic bags for serialization.
    foreach ($this->bags as $bag)
    {
        $toSerialize[$bag->getStorageKey()] = $bag->all();
    }

    // Get metadata bag.
    $meta = $this->metadataBag;

    // Stamp it.
    $meta->stampNew();

    // Push for serialization.
    $toSerialize[$meta->getStorageKey()] = array(
        MetadataBag::UPDATED => $meta->getLastUsed(),
        MetadataBag::CREATED => $meta->getCreated(),
        MetadataBag::LIFETIME => $meta->getLifetime()
    );

    // Serialize the data.
    $serialized = $this->_serializer->serialize($toSerialize);

    // Commit to storage.
    $written = $this->saveHandler->write($this->saveHandler->getId(), $serialized);

    if (!$this->saveHandler->isWrapper() && !$this->getSaveHandler()->isSessionHandlerInterface()) {
        $this->saveHandler->setActive(false);
    }

    $this->closed = true;
}

Ratchet\Session\Serialize\PhpHandler:

Last, but not least, the RuntimeException "Serialize PhpHandler:serialize code not written yet, write me!" has got an implementation.

function serialize(array $data) {
    $returnData = '';

    foreach ($data as $key => $values)
    {
        $returnData .= $key . '|' . serialize($values);
    }

    return $returnData;
}

My concerns are about saving and serialization.

Saving: I'm unsure if a simple save onClose will properly close the session. Previosuly there was no specific close handling, may be it's automated, didn't dig deeper to find out every detail.

Serialization: I based the serializators output on the rawData input. My goal was to exactly match the input data. For that purpose it works, but I'm not sure it it will match everything.


Looking forward to feedback.

WyriHaximus commented 11 years ago

The thing that worries me is if you only write is onClose you might overwrite something in the session that was done after the WS connection was opened.

For example:

How to deal with those kinds of situations? Should the session be polled once in a white for new data? Or build the application to go AJAX or WS only?

P.S. I really like the idea of writable session :+1:

tomsseisums commented 11 years ago

Oh, I just noticed, that Session should be saved only after the Application close event, because, developers might change the session when closing the connection. I needed it.

So, SessionProvider's code becomes:

function onClose(ConnectionInterface $conn) {
    // "close" session for Connection

    $appClose = $this->_app->onClose($conn);

    $conn->Session->save();

    return $appClose;
}

@WyriHaximus, yes, your concern is valid, didn't think of it. Well, that has to be fixable, at least, to some extent - some concurrency issues will still remain.

A quick and dirty fix would be to, before saving it from WS:

  1. Cache current session parameters.
  2. Reload from storage.
  3. Merge.
  4. Save.

My production environment, where I needed the functionality, uses HTTP + WS only, no AJAX. Quite possibly that in future there will be some, but I'll probably build around with the concurrency issues in mind.

tomsseisums commented 11 years ago

I just ran into an issue that WS session wasn't yet saved, so, saving onClose will not simply cut it. We need more reads, and more saves. I think they should be both, waiting on interval and listening for events. Though, there might be better alternatives.


Did a tempfix, that saves on each event by using Symfony\Component\HttpFoundation\Session\Session::migrate (resolves to Ratchet\Session\Storage\VirtualSessionStorage::regenerate).

Tempfix, because, normally, before saving, I should get actual data from storage again and merge with changes, but, I'm only saving for now.

The result works, though. And, giving it a thought, seems, that it actually should be enough to circumvent any concurrency issues.

WyriHaximus commented 11 years ago

Well we could read => merge => save. And poll to keep it up to date for reading.

tomsseisums commented 11 years ago

@WyriHaximus, yeah, that's what I meant. Plus, read the update on previous post, we shouldn't be polling, I feel that it should be enough to do it simply on every event - message, open, close.

cboden commented 11 years ago

Great work so far @joltmode and great discussion guys! Writing to sessions is something I absolutely want to implement. This goes hand-in-hand with #134 .

The users WebSocket connection, in most circumstances, will live over the span of other requests that may update the session, most likely XMLHTTPRequests made in an application while the WebSocket remains active. @joltmode was on to this and for this reason I don't think save should be called onClose as there could easily be a ton of conflicts and stale data. I would opt to give this power to the user and let them call save() when they want to write to the session.

Reading, logistically, may be a lot more complicated. I believe one option could be if using Redis have events update the Ratchet Session data as it changes in the database. This would be a perfect scenario where anytime the developer access a session variable s/he knows it's the latest value.

Alternatively, none of the other technologies, to the best of my knowledge, will give us this feature. If we're using MySQL (for example) and having it served asynchronously via a child process I don't think we can get auto-updating session values. We could do something like allow the user to refresh the session data on request. It would async ask to update the data returning a promise when it's complete. My only concern is if the user does this in a loop it could be a lot of single queries to update a lot of sessions frequently.

tomsseisums commented 11 years ago

A simple onOpen load and onClose save - yes, I agree, that could cause a lot of problems. And actually, it already did, so I looked further.


Some options, based on @cboden observations


Technical look

What I was thinking, though, is that whenever you want to do something with WS, it will be reflected on server with events. There is no escaping these events.

Lets take a technical look at the exact sample @WyriHaximus mentioned, with the WD (working draft) solution:

  1. User opens application (HTTP GET)
  2. WS connection is setup and session loaded (HTTP UPGRADE, WS onOpen)
  3. User logs into the application over AJAX (XMLHTTPRequest (HTTP anyways) GET/POST/PUT/DELETE/*, sessionChange)
  4. Does something (HTTP GET/POST/PUT/DELETE*)
  5. Refreshes the page because he/she thinks it's frozen (WS onClose)
  6. Ratchet writes the old session data where the user isn't logged in yet (is the result of above event)
  7. User has to login again (Is the result of concurrency)

With my initial example, we have a general (because the session will be saved before the Applications onClose) problem. With my first fix, we have the concurrency problem.

I did an upgrade later on, that I described poorly in my penultimate post and I will expand upon here.


Smart(?) workaround

What I want to emphasize is, that Ratchet is always on top of events, there is no escaping them. You can close the browser in the middle of a long AJAX request and it could get halted, but WebSockets? Hell no. At least I haven't yet found a way to escape.

I suggest, for Ratchet, (and already made it in my production environment) to, before dispatching incoming events to Application (MessageComponentInterface), reload the session. Then, after the Application has given it's kick to the event (and possibly the session), save it. I see it, actually, unnecessary to check for changes and merge here, the data should be valid.

Now, that gives us:

  1. User opens application (Start/Load Session = #1)
  2. WS connection is setup and session loaded (Ratchet onOpen, Load Session = #1, Application onOpen, Save Session = #2)
  3. User logs into the application over AJAX (Load Session = #2, Save Session = #3)
  4. Does something
  5. Refreshes the page because he/she thinks it's frozen (Ratchet onClose, Load Session = #3, Application onClose, Save Session = #4) 5.1. For the "reconnect" (Ratchet onOpen, Load Session = #4, Application onOpen, Save Session = #5)
  6. Ratchet writes the old session data where the user isn't logged in yet (No, it does not, because it updates the session upon every event.)
  7. User has to login again

#n is the Session "state" number.

This still leaves us with a little overhead - each event loads/saves the session, but hey, that's the same thing HTTP does. And also AJAX - yes, for that little JSON response, we actually hook up everything in our applications. Though, event handling are the only moments, when Ratchet has to be aware of the session changes. We can reduce some overhead by committing to save only if the session has changed. Also, this doesn't make Ratchet to save the session on each set/remove/*.


I'm actually very satisfied with the result that the above implementation provides (I haven't implemented the changed session state check, though).

Oops, turned out to be a little longer than I expected. :astonished:

WyriHaximus commented 10 years ago

For the things you wrong down it's not long @joltmode :smile: and it puts down a real good solution for the problem.

:+1: for going Redis, the keyspace notifications will become really useful with this. And can inform us when the specific key changes for a given session. I'm with @cboden that really is the only viable option to do this properly without causing to much load on the infrastructure right now. Those evens would also mean we don't have to fetch the session for each event cause we get notified if it changes and only have to fetch it when it is changed.

Regarding saving/updating session data. Modifying the session could mean saving it, which is fine when you only want to update one value within the session. Or when you do it in the event it self, Ratchet can notice the change during the event and write the new session. But what about session modification in a callback that is called long after the event has been finished? And you want to set 4 new values, delete 3 and update 5 values on top of that? that gives you 12 write operations towards Redis. While you can combine that into one write operation. That is why I like to propose beginTransaction and commitTransaction on the session object, starting a transactiong and preventing the session write untill you commit the transaction once your done modifieing the session. So that in cases where a you aren't inside the event anymore you can still modify the session.

cboden commented 10 years ago

Sorry for the delay in my response, it's been far too long on my part. There's a lot of good ideas and solutions in this thread.


Multiple extensions for SessionProvider: read only, read+manual write, read+auto write?

I think the principle of this idea is good. There are a lot of possibilities that have been covered in this discussion and I don't think it's possible to achieve every developers desires with one implementation. My only slight reservation is this would be an adjustment or addition of a requirement of the SessionProvider. SessionProvider is a link in the decorator chain of a Ratchet app that is a Facade accepting configurations + objects in order to provide a Session object to a Connection. I imagine instead of hard coding a VirtualSessionStorage this would be configurable in the way @joltmode described.


While I once entertained the idea of only enabling Redis because of the way it would keep a session in sync I don't like the idea of forcing developers both into a storage as well as a flow. The reason I chose Symfony Sessions initially was near seamless compatibility with a sync stack. That being said I would like to see an implementation that developers users with Redis sessions to have this always in sync session data option.


My apologies as this may sound dismissive, but I don't want to tackle "merging" at this point. Timing issues, race conditions, state management, and especially conflict resolution are a bag of hurt. A philosophy I keep going back to is I'd rather be explicit and implicit. Merging is something that can be left to the developer. I'm thinking once the session is opened the data remains as-is which means it could go stale (given they're not using the Redis-sync adapter we've though up). If a developer wants up to date session data, that may have been changed externally, they can do a fetch/reload/refresh on the Session. If a developer is worried about conflicts they can copy the session data before the do this and make the comparison themselves.

@WyriHaximus This paradigm would exhibit your transactional model. Nothing commits until the developer manually fetches or saves session data.


Following up with being explicit I don't think we should save the session onClose for the following reasons:


One last concern to look into: Symfony closely follows what PHP native sessions do. We need to make sure we can comfortably get around calling save() on a Session without forcing the session into a "closed" state.