LionWeb-io / lionweb-repository

Reference implementation of LionWeb repository
Apache License 2.0
2 stars 1 forks source link

Add history to the repository #55

Closed joswarmer closed 1 month ago

joswarmer commented 2 months ago

The lionweb-repository now supports history, where old versions of nodes are kept around. Uses views and triggers top hide the history management from the server using it.

enikao commented 1 month ago

Have you thought about using PreparedStatement or ParameterizedQuery instead of string concatenation for queries? Strings are virtually guaranteed to attract Bobby Tables.

joswarmer commented 1 month ago

Have you thought about using PreparedStatement or ParameterizedQuery instead of string concatenation for queries? Strings are virtually guaranteed to attract Bobby Tables.

I did think about it and Meinte also mentioned this. I am using the pg-promise package to build queries most of the time, this should be safe against attacks. However, depending on a library is not as safe as using parameterized queries.

I think things are fine for now, developing is easier and more flexible this way. However, when someone wants to use the repository in production, we might need to change some or all SQL to using PreparedStatements or ParameterizedQueries.

dslmeinte commented 1 month ago

Using PreparedStatements and ParametrizedQuerys might also have some performance benefits, although having the right indices probably brings a lot more.

joswarmer commented 1 month ago

Using PreparedStatements and ParametrizedQuerys might also have some performance benefits, although having the right indices probably brings a lot more.

There are quite some ways to optimize performance, especially in the bulk load scenario, see https://www.postgresql.org/docs/current/populate.html.

Also adding the option to not use history tables might be good for performance in case you do not need history.

And splitting the current history table that contains all rows including historical ones into two separate tables, one for current data, one for all historical data, could also improve performance if your main usage is on the current state.

Maybe we should create two issues:

joswarmer commented 1 month ago

Input parameters like node id's are also validated to conform to the LionWeb format, so they cannot be use in SQL attacks.