vaadin / flow

Vaadin Flow is a Java framework binding Vaadin web components to Java. This is part of Vaadin 10+.
Apache License 2.0

Stateless Mode #13298

Open pleku opened 6 years ago

pleku commented 6 years ago

@vaadin-bot commented on Fri Jan 28 2011

Originally by @jojule


This proposal outlines a new operation mode for Vaadin applications in which neither HttpSession nor any other server-side state is needed. It is an optional mode that could widen the range of applications where Vaadin can be used, not a default or recommended mode of operation.

Enabling

Set parameter stateless = true in web.xml for the ApplicationServlet.

Implementation

After each http-request, the application state is serialized and removed from the HttpSession. The serialized application state is sent along with the UIDL to the client. The client stores the application state in a JavaScript variable. Whenever the client contacts the server, it also sends the application state along with the request parameters.
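A minimal sketch of the round trip described above, assuming plain Java serialization and a URL-safe Base64 encoding so that the state can travel as a request parameter. StateCodec and its methods are hypothetical names, not part of Vaadin. In a real implementation the token would have to be authenticated before it is deserialized, since deserializing untrusted bytes is dangerous.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Base64;

// Hypothetical helper: turns application state into a request-parameter-safe string and back.
final class StateCodec {

    /** Serializes the state and encodes it as URL-safe Base64 text. */
    static String encode(Serializable state) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(state);
            }
            return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Decodes and deserializes a token previously produced by encode(). */
    static Object decode(String token) {
        byte[] bytes = Base64.getUrlDecoder().decode(token);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```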

Security

The implementation must guarantee that the client cannot modify the application state. This is done by maintaining a random salt on the server and appending a SHA-2 checksum, calculated from the serialized state plus the salt, to the end of the state.
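One way to sketch the tamper protection is shown below. Note that it swaps in HMAC-SHA-256 from the standard javax.crypto API rather than a plain SHA-2 hash over state plus salt, because a bare hash of concatenated data can be open to length-extension attacks; the server secret plays the role of the salt. StateSigner is a hypothetical name, and this is an illustration under those assumptions rather than the proposed implementation.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper; none of these names come from Vaadin.
final class StateSigner {
    private static final String ALGORITHM = "HmacSHA256";
    private static final int TAG_LENGTH = 32; // HMAC-SHA-256 output size in bytes

    private final SecretKeySpec key;

    StateSigner(byte[] serverSecret) {
        this.key = new SecretKeySpec(serverSecret, ALGORITHM);
    }

    /** Returns the state followed by a 32-byte authentication tag. */
    byte[] sign(byte[] state) {
        byte[] tag = mac(state);
        byte[] signed = Arrays.copyOf(state, state.length + TAG_LENGTH);
        System.arraycopy(tag, 0, signed, state.length, TAG_LENGTH);
        return signed;
    }

    /** Returns the original state, or throws if the tag does not match. */
    byte[] verify(byte[] signed) {
        if (signed.length < TAG_LENGTH) {
            throw new SecurityException("State too short to contain a tag");
        }
        int stateLength = signed.length - TAG_LENGTH;
        byte[] state = Arrays.copyOfRange(signed, 0, stateLength);
        byte[] tag = Arrays.copyOfRange(signed, stateLength, signed.length);
        // MessageDigest.isEqual compares without leaking timing information
        if (!MessageDigest.isEqual(tag, mac(state))) {
            throw new SecurityException("State was modified");
        }
        return state;
    }

    private byte[] mac(byte[] data) {
        try {
            Mac mac = Mac.getInstance(ALGORITHM);
            mac.init(key);
            return mac.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```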

For many applications, the state might not contain any secrets that cannot be revealed to the client. Still, in some cases the developer chooses to store some server-side secrets in the state (such as the password for a DB connection). In order to keep the state secret, it is encrypted using AES-256 with a random key stored on the server.
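The encryption step could look roughly like the sketch below. The proposal only names AES-256; the GCM mode, the 12-byte random IV prepended to the ciphertext, and the class name StateCipher are assumptions made for illustration. GCM also authenticates the ciphertext, so a modified state fails to decrypt.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper; the mode and message layout are assumptions, not part of the proposal.
final class StateCipher {
    private static final int IV_LENGTH = 12;  // bytes, the usual IV size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length

    private final SecretKeySpec key;
    private final SecureRandom random = new SecureRandom();

    StateCipher(byte[] key256) {
        if (key256.length != 32) {
            throw new IllegalArgumentException("AES-256 needs a 32-byte key");
        }
        this.key = new SecretKeySpec(key256, "AES");
    }

    /** Returns a fresh random IV followed by the ciphertext and its tag. */
    byte[] encrypt(byte[] state) {
        try {
            byte[] iv = new byte[IV_LENGTH];
            random.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(state);
            byte[] message = Arrays.copyOf(iv, IV_LENGTH + ciphertext.length);
            System.arraycopy(ciphertext, 0, message, IV_LENGTH, ciphertext.length);
            return message;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encrypt(); throws SecurityException if the message was altered. */
    byte[] decrypt(byte[] message) {
        if (message.length < IV_LENGTH) {
            throw new SecurityException("Message too short to contain an IV");
        }
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, message, 0, IV_LENGTH));
            return cipher.doFinal(message, IV_LENGTH, message.length - IV_LENGTH);
        } catch (GeneralSecurityException e) {
            throw new SecurityException("State could not be decrypted", e);
        }
    }
}
```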

Open question: how the key and salt are stored and shared between the servers in clustered configurations?

Optimizations

The serialized state can grow quite large, so it should be compressed with GZIP before it is sent to the client. The effect of encryption on performance should be measured, and if the effect is noticeable, a parameter for turning encryption off should be added to web.xml. Still, encryption must be on by default.

Quick tests show that for a really small application the state can be around 8kb, but it can be compressed to 3.5kb with GZIP. Each additional TextField (with a random caption, "foo" + Math.random()) increases the size of the serialized session by 134 bytes. When GZIP is used, the state size per TextField goes down to 13 bytes, which is just the size of the caption.

Research whether seeding the compression algorithm with a stored copy of the application state taken right after init() would make the compression more efficient. One way to initialize the compression dictionary is to use http://download.oracle.com/javase/1.5.0/docs/api/java/util/zip/Deflater.html#setDictionary(byte[]).
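The dictionary idea can be sketched with java.util.zip as follows. SeededCompressor is a hypothetical name, and the seed snapshot is assumed to be the serialized state captured once after init() and available on every server. Both sides must use the same dictionary; the Inflater reports that it needs one through needsDictionary().

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper: compresses state using a snapshot taken after init() as a preset dictionary.
final class SeededCompressor {
    private final byte[] dictionary;

    SeededCompressor(byte[] seedSnapshot) {
        this.dictionary = seedSnapshot.clone();
    }

    byte[] compress(byte[] state) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        try {
            deflater.setDictionary(dictionary); // must be set before any input is deflated
            deflater.setInput(state);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                out.write(buffer, 0, n);
                if (inflater.needsDictionary()) {
                    // The stream header says a preset dictionary was used
                    inflater.setDictionary(dictionary);
                } else if (n == 0 && !inflater.finished() && inflater.needsInput()) {
                    throw new IllegalArgumentException("Truncated compressed state");
                }
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException(e);
        } finally {
            inflater.end();
        }
    }
}
```

Usage would be to create one SeededCompressor per application type at startup and reuse it for every request, since the seed does not change between requests.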

Use cases

Feasibility

Stateless mode could be useful and practical for really small applications with only a bit of state. A practical limit for the (compressed) state size could be some tens of kilobytes.

Serialization time for an application with some 1000 UI components (TextFields), resulting in 152kb of (uncompressed, unencrypted, unsigned) state, is 3ms on a 2.4GHz core duo MacBook. As we added more components, the measured serialization time was proportional to the number of components in the application. A guesstimate for the total overhead, including serialization, deserialization, compression, encryption and signing, is around 0.01ms per component in the application.

Simple applications embedded in public-facing web sites are the most likely users of the stateless mode. These applications include registration forms, small calendars, insurance calculators, questionnaires and similar mini-applications. For these applications the number of components is typically somewhere between 10 and 100. If we guesstimate the average to be 30 components, we may estimate that the overhead for such an application is 0.3ms of server CPU time and 10kB of combined in+out data transfer per http-request. Let's assume that a user causes on average 15 UIDL ajax requests per day. A public-facing system with 1 million daily users would then add just 5% overhead to one server CPU for serialization. A more visible cost would be the additional 150GB of combined in+out bandwidth needed (in Amazon EC2 this would cost about 15 USD / day).
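The arithmetic behind these estimates can be checked with a few lines of Java. The constants are the guesstimates from the text; the class name OverheadEstimate and the use of decimal units (1 GB = 1,000,000 kB) are assumptions made for this illustration.

```java
import java.util.Locale;

// Hypothetical calculator that reproduces the back-of-the-envelope numbers.
final class OverheadEstimate {
    static final double MS_PER_COMPONENT = 0.01;   // guesstimated overhead per component
    static final int COMPONENTS = 30;              // assumed average application size
    static final long DAILY_USERS = 1_000_000L;
    static final int REQUESTS_PER_USER = 15;       // UIDL requests per user per day
    static final double KB_PER_REQUEST = 10.0;     // combined in+out transfer

    static double cpuMsPerRequest() {
        return MS_PER_COMPONENT * COMPONENTS;
    }

    static long requestsPerDay() {
        return DAILY_USERS * REQUESTS_PER_USER;
    }

    static double cpuSecondsPerDay() {
        return requestsPerDay() * cpuMsPerRequest() / 1000.0;
    }

    /** Fraction of one CPU core kept busy, given 86,400 seconds in a day. */
    static double cpuLoadFraction() {
        return cpuSecondsPerDay() / 86_400.0;
    }

    static double trafficGbPerDay() {
        return requestsPerDay() * KB_PER_REQUEST / 1_000_000.0;
    }

    public static void main(String[] args) {
        System.out.printf(Locale.ROOT, "CPU per request: %.1f ms%n", cpuMsPerRequest());      // 0.3 ms
        System.out.printf(Locale.ROOT, "CPU per day:     %.0f s%n", cpuSecondsPerDay());      // 4500 s
        System.out.printf(Locale.ROOT, "CPU load:        %.1f %%%n", cpuLoadFraction() * 100); // 5.2 %
        System.out.printf(Locale.ROOT, "Traffic per day: %.0f GB%n", trafficGbPerDay());      // 150 GB
    }
}
```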

This implementation should probably be an add-on to Vaadin, implemented on top of vaadin/flow#7207


@vaadin-bot commented on Fri Dec 09 2016

Originally by @jojule


The feasibility study numbers might be underestimated.

I measured the effect of dictionary pre-seeding with the simple address book tutorial app (but only with one row of data in table). When editing an address in form, the serialized data is 21050 bytes. It compresses down to 6758 bytes with default java.util.zip.Deflater. If another application state snapshot is created just after init and it is used for dictionary seeding, the compressed size goes down to 3071 bytes.

Total time for serialization (including serialization, compression and dictionary seeding) is 3ms on my 2.4GHz core 2 duo MacBook. The compression is actually quite slow as the serialization was just 0.5ms with the same setup.

I also added AES-256 encryption. It adds just 0.2ms on top of the compression.

Even with these numbers the additional CPU load is not a problem for low complexity applications.


@vaadin-bot commented on Fri Dec 09 2016

Originally by kynao


Regarding the opened question: "how the key and salt are stored and shared between the servers in clustered configurations? "

Why not use OrientDB (http://www.orientechnologies.com/), released under the Apache 2.0 license? OrientDB is a free, open-source Java NoSQL database with transaction support. It is really fast, lightweight and actively maintained, with document, graph and object capabilities, while still supporting SQL. Clustering features are almost done.

Any opinions?


@vaadin-bot commented on Fri Dec 09 2016

Originally by @hezamu


Replying to kynao:

Regarding the opened question: "how the key and salt are stored and shared between the servers in clustered configurations? "

All clustering solutions offer a way to share data between nodes, but IMO we don't want to rely on those.

This means that we need an external solution for the problem. OrientDB would work, but something like memcached might be simpler and easier.


@vaadin-bot commented on Fri Dec 09 2016

Originally by @jojule


Sharing the server-side secrets should not depend on any external solution. The simplest solution would be to just include a passphrase in the server configuration and use that to generate the secrets. Web.xml, code and anything else stored in the war would probably be a bad place, as it is too easy to forget that the secrets are there and thus too easy to leak them. These secrets should be regarded in the same way as SSL certificate private keys. That said - we could easily generate them from the certificate private key if SSL is used.
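Generating the secrets from a configured passphrase could be sketched as below, assuming PBKDF2 (available in the standard javax.crypto API as "PBKDF2WithHmacSHA256") as the derivation function. The class name ClusterSecrets, the fixed context label used as the PBKDF2 salt, and the iteration count are all illustrative assumptions. Because the derivation is deterministic, every server configured with the same passphrase ends up with the same keys without any shared storage.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Hypothetical helper: derives an encryption key and a signing key from one passphrase.
final class ClusterSecrets {
    // Fixed, non-secret label; it has to be identical on every node (assumption for this sketch)
    private static final byte[] CONTEXT = "stateless-mode-v1".getBytes(StandardCharsets.UTF_8);
    private static final int ITERATIONS = 100_000; // illustrative value

    final byte[] encryptionKey; // 32 bytes, usable as an AES-256 key
    final byte[] signingKey;    // 32 bytes, usable as a checksum or HMAC secret

    private ClusterSecrets(byte[] encryptionKey, byte[] signingKey) {
        this.encryptionKey = encryptionKey;
        this.signingKey = signingKey;
    }

    static ClusterSecrets fromPassphrase(char[] passphrase) {
        try {
            // Ask for 512 bits and split them into two independent 256-bit keys
            PBEKeySpec spec = new PBEKeySpec(passphrase, CONTEXT, ITERATIONS, 512);
            byte[] material = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
            spec.clearPassword();
            return new ClusterSecrets(
                    Arrays.copyOfRange(material, 0, 32),
                    Arrays.copyOfRange(material, 32, 64));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```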


@vaadin-bot commented on Fri Dec 09 2016

Originally by kynao


Replying to Muurimaa:

Replying to kynao:

Regarding the opened question: "how the key and salt are stored and shared between the servers in clustered configurations? "

All clustering solutions offer a way to share data between nodes, but IMO we don't want to rely on those.

This means that we need an external solution for the problem. OrientDB would work, but something like memcached might be simpler and easier.

I have read a lot suggesting that a dedicated DB is ultimately better suited for this than memcache, but I'm not sure this is the right place to discuss it; maybe the forum instead? I'm also not sure OrientDB is a complicated thing :). Maybe we could invite Luca (the OrientDB author) to discuss it with us, if you want.


@vaadin-bot commented on Fri Dec 09 2016

Originally by Jani Laakso


Security tip: You must also add a timestamp to the serialized state and check it on the Vaadin server, so that the client cannot revive / duplicate old states.
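A minimal sketch of the timestamp idea, assuming the timestamp is prepended to the state before the state is signed or encrypted, so that the client cannot change it. TimestampedState and the maximum-age parameter are hypothetical. Note that a timestamp alone only limits how long an old state stays usable; it does not stop a replay within the allowed window.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: prefixes the state with its creation time and rejects stale states.
final class TimestampedState {

    /** Returns an 8-byte big-endian timestamp followed by the state. */
    static byte[] stamp(byte[] state, long nowMillis) {
        return ByteBuffer.allocate(Long.BYTES + state.length)
                .putLong(nowMillis)
                .put(state)
                .array();
    }

    /** Returns the state if it is no older than maxAgeMillis, otherwise throws. */
    static byte[] unstamp(byte[] stamped, long nowMillis, long maxAgeMillis) {
        if (stamped.length < Long.BYTES) {
            throw new SecurityException("State too short to contain a timestamp");
        }
        ByteBuffer buffer = ByteBuffer.wrap(stamped);
        long issuedAt = buffer.getLong();
        // Reject states from the future as well as states that are too old
        if (issuedAt > nowMillis || nowMillis - issuedAt > maxAgeMillis) {
            throw new SecurityException("State has expired");
        }
        byte[] state = new byte[buffer.remaining()];
        buffer.get(state);
        return state;
    }
}
```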

Here's a wild idea: How about setting up a "Vaadin state proxy server" on some trustworthy service like Amazon? Clients (browsers) would always connect to the Amazon server, which would act as a proxy for the request / response cycle with the real Vaadin servers and take care of state persistence in the middle. This way the browser would not need to store state at all; only a cookie is required. Of course this adds more latency, but networks are very fast. In practice this means offloading the states of the Vaadin servers into one big server. Obviously using browsers to store the state is the best solution; this idea is just for brainstorming other possibilities.

Note, Henri & Joonas: I have some experience regarding OrientDB but I do not understand how it relates to this feature?

PS. This feature is really nice!


@vaadin-bot commented on Fri Dec 09 2016

Originally by Jani Laakso


Java serialization is quite slow and inefficient. Consider e.g. Kryo, which is an interesting product. I toyed with it a few months ago in my transparent persistence framework.

Kryo also comes with DeltaCompressor, which caches bytes for objects that were serialized for a specific receiver. Subsequent serialization for the same receiver results only in bytes that describe the delta from the last serialization. This can greatly reduce the number of bytes needed to serialize an object that seldom changes dramatically. This feature would drastically reduce bandwidth requirements because server would send only deltas back to clients.

If using Kryo is not an option, we might be able to create diffs on the server and send only those back to the client. Of course this would require either sending the initial serialization data plus deltas from client to server and periodically sending the full serialization data back to the client (in case the cumulative deltas grow too large), or somehow merging the deltas back into the serialized state data on the client. A simple diff would work and should be reasonably simple to merge on the client side.

More information here: http://code.google.com/p/kryo/wiki/BenchmarksAndComparisons


@vaadin-bot commented on Fri Dec 09 2016

Originally by Funtick


@Jani Laakso: KRYO can't deserialize java.lang.Locale ;)

@all I have had success with Sticky and Non-Sticky(!!!) serializations with Infinispan Memcached, VAADIN v. 6.8.9, Tomcat 6.x, and http://code.google.com/p/memcached-session-manager/

However, I encountered problems such as: com.vaadin.event.Action class not implementing "equals()", com.vaadin.event.ListenerMethod not outputting properly because of a serialization/deserialization exception, the trickiest errors with inner ENUMs, serializing "Method", the SimpleFileUpload field using ByteArrayOutputBuffer, etc.

I just want to say that "non-sticky" does indeed work (and it will load the session from persistent storage at the beginning of each UIDL request, even in a single-member cluster).

However, I am waiting for surprises... such as with "equals()" method implementation...

Thanks


Legioth commented 6 years ago

In addition to the pre-6.x era discussion imported here, there's one additional opportunity in Flow. Each URL as managed through Flow's router can work as its own standalone "checkpoint", and we would only have to track what has happened since the last time a checkpoint was entered.

knoobie commented 4 years ago

Given your plans for "CCDM" in V15, is this ticket still "Not Planned"?

Legioth commented 4 years ago

The new client-centric functionality that is being built in the ccdm branch is indeed also targeting stateless server-side logic, but the approach for doing that is completely different from what's described here.

The idea in the ccdm branch is that the application's UI logic is implemented as client-side TypeScript code, and communication with server-side business logic implemented in Java is handled automatically.

The idea described here is that the entire application would use server-side Java for the UI logic. The only difference compared to a typical Vaadin application is that the component tree would be serialized, sent to the client in each response, and sent back to the server in each request, instead of being stored on the server between requests.

knoobie commented 4 years ago

Thanks @Legioth! I'm looking forward to the changes in the ccdm branch. A stateless architecture for Vaadin would be a huge improvement nowadays, with containerization and microservices everywhere.

pleku commented 4 years ago

My expectation is that this will never happen for the Java driven UIs now that we have the hybrid approach with Java+TS that can be made stateless (soon). So basically being stateless will not be done for Java driven UIs. For those who require stateless mode, they should use the Java+TS approach instead, available since Vaadin 15.

If I've understood correctly, the Java+TS approach is not yet fully stateless due to some access token that is in server side state, and thus would need some extra work but it is not yet planned when that might happen.

I'll keep the issue open, but I want to make it clear that there are no plans for doing the following for Java driven UIs:

The idea described here is that the entire application would use server-side Java for the UI logic. The only difference compared to a typical Vaadin application is that the component tree would be serialized, sent to the client in each response, and sent back to the server in each request, instead of being stored on the server between requests.

Legioth commented 4 years ago

To clarify, the Java+TS approach currently uses the servlet session for the purpose of keeping track of authentication (through Spring Security) and for storing a per-session CSRF token.

The plan for the future for Java+TS is to support using JWT for authentication so that a proof of identity would be included in each request. At the same time, this would remove the need for a CSRF token since authentication wouldn't be based on a session cookie that would be automatically included by the browser in all requests.

It should be possible to manually emulate the same mode of operation in an application by marking all endpoints as @AnonymousAllowed and then manually including a custom authentication token as a parameter in each request.

platosha commented 2 years ago

The original issue concerns Java UI state, so let me transfer it back to the Flow repo.

Meanwhile, the Hilla framework (the former Java endpoints + TypeScript client-side UI approach mentioned above) already supports stateless authentication, see https://hilla.dev/docs/security/spring-stateless