Closed. knolleary closed this issue 2 months ago.
@joepavitt should this now be on the dev board so it can be in the design stage?
Thanks for checking @hardillb
Assumptions:
Questions:
`/data` would be best, but has problems (see comment).

Research required:
Mounting any volume on `/data` (the userDir) would mean that `node_modules` would persist across restarts. This would mean that installed nodes would be persistent, decreasing start-up time, but it would cause problems when stack changes happen, as a stack change could change the Node.js version and require a rebuild of any native components.

We would also want the mount point to be the current working directory of the Node-RED process, so that any files created without a fully qualified path end up in the mounted volume. The core Node-RED File nodes have an option that can be set in `settings.js` to control this, but I don't think any 3rd-party nodes honour that setting.
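For reference, the setting in question is `fileWorkingDirectory`, which only affects the core File In/Out nodes. A sketch of the relevant `settings.js` fragment, assuming the `/data/storage` mount point discussed later in this issue:

```javascript
// settings.js (Node-RED) - config fragment, path is an assumption
module.exports = {
    // ...other settings...

    // Resolve relative paths used by the core File nodes against the
    // mounted volume rather than the process working directory.
    // Third-party nodes do not honour this setting.
    fileWorkingDirectory: "/data/storage"
};
```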
AWS Storage options:
- EBS (https://aws.amazon.com/ebs/) - block based, needs a file system on top (but the K8s provisioner will format on creation)
- EFS (https://aws.amazon.com/efs/) - self-scaling, filesystem based
- FSx (https://aws.amazon.com/fsx) - not sure this would work for what we want; it looks more like a SAN in the cloud
- S3 (https://aws.amazon.com/s3) - object storage; while it can look like a filesystem, I don't think this is what we want
Would having a filesystem (automatically) permit virtual memory and thus improve the memory issues/crashes witnessed while installing?
> Mounting any volume on /data (the userDir) would mean that node_modules would persist across restarts. This would mean that installed nodes would be persistent increasing start up time
I thought having persistent FS would decrease start up time? (typo?)
> Would having a filesystem (automatically) permit virtual memory and thus improve the memory issues/crashes witnessed while installing?
Not possible, you can't add swap space inside a container
> I thought having persistent FS would decrease start up time? (typo?)
Yes typo
> Not possible, you can't add swap space inside a container
I remember seeing it was an alpha feature some time back.
seems it is now in beta: https://kubernetes.io/blog/2023/08/24/swap-linux-beta/
Totally happy to be told I am reading the wrong thing about an unrelated subject
No, that is not useful; that is for overall memory usage of the whole node, not on a per-pod basis (also off-topic for this issue).
Also need to decide on quota implementation. It looks like the smallest increment we can mount is 1GB on AWS.
Need to know where this will sit in the Team/Enterprise levels and what happens on migrations between levels in FFC (given the work Nick has had to do for instance sizes being unavailable at higher levels).
We should approach this topic from two perspectives - core app in general and FFC on EKS.
For the first one - the core app (or probably the k8s driver) should use dynamic storage provisioning approach - create a Persistent Volume Claim based on the provided storage class configuration and use it in the deployment definition. As a software provider, we cannot determine each possible storage class. As stated in the linked documentation, the cluster administrator is responsible for creating a storage class that meets the requirements. The name of the storage class should be passed to the application as a configuration parameter.
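A sketch of the PersistentVolumeClaim the k8s driver might generate, assuming the storage-class name (`flowfuse-storage` here) is the configuration parameter supplied by the cluster administrator and the quota is the 1GB minimum mentioned above (the claim name is a hypothetical naming scheme):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: instance-<project-id>-storage   # hypothetical naming scheme
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: flowfuse-storage    # provided by the cluster administrator
  resources:
    requests:
      storage: 1Gi                      # per-instance quota
```

The deployment definition would then reference this claim in its `volumes` section.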
From the FFC perspective - once the above is implemented, we are limited to EBS and EFS. We should aim to use EFS rather than EBS, for the following reasons:
Having all the above in mind, EFS should be our first choice. However, behind EFS there is the NFS protocol, and my main concern is its performance. Before making any production-ready decisions I would suggest building a strong PoC first.
Although using AWS S3 as EKS storage is possible via a dedicated CSI driver, we should avoid it since it does not support dynamic provisioning.
References:
- https://zesty.co/blog/ebs-vs-efs-which-is-right/
- https://www.justaftermidnight247.com/insights/ebs-efs-and-s3-when-to-use-awss-three-storage-solutions/
Summary of discussion between @hardillb and myself:
The volume will be mounted at `/data/storage` and we'll update nr-launcher to use that as the working directory of the NR process.

Exact details TBD, but one option will be for nr-launcher to copy files back from the file-store prior to starting Node-RED the first time it starts up with the new storage option.
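A minimal sketch of the launcher-side change, with all names hypothetical (nothing like this exists in nr-launcher yet): the key point is that the Node-RED process gets spawned with the mounted volume as its working directory.

```javascript
// Hypothetical sketch: nr-launcher starts Node-RED with the mounted
// volume as the process working directory, so any relative file path
// (File nodes, sqlite db files, etc.) lands on persistent storage.
const STORAGE_DIR = "/data/storage"; // the agreed mount point

function nodeRedSpawnOptions() {
    return {
        cwd: STORAGE_DIR,   // working directory of the NR process
        stdio: "inherit"
    };
}

// Usage inside nr-launcher (illustrative):
//   const { spawn } = require("child_process");
//   spawn("node-red", ["-u", "/data"], nodeRedSpawnOptions());
```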
We will identify the current list of instances actively using the file store and assess the scale of migration needed. It may be we can apply something more manual at a small scale - although we need to consider self-hosted customers who choose to adopt this.
We already provide a storage quota per team type - but that is limited to our File Nodes and has limited uptake (will get exact numbers to back this assertion up)
We have two options:
Ultimately this will be a choice we can make further down the implementation as it will be a final stage configuration to apply to the platform.
The following items need some additional research to ensure we have a scalable solution.
The EFS limits are documented as:
We provide each instance its storage via an access point on a volume, and each EFS volume can accommodate 120 access points - so across multiple volumes we'll have capacity for 120k instances. The volume limit is also one that can be increased on request. We'll need a way to manage the mapping of instances to volumes to ensure even utilisation.
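One hedged sketch of what the instance-to-volume mapping could look like - every name here is invented for illustration, only the 120-access-point cap comes from the EFS documentation:

```javascript
const MAX_ACCESS_POINTS = 120; // documented EFS limit per volume (file system)

// Hypothetical mapping strategy: pick the fullest volume that still has
// spare capacity, so volumes fill up one at a time and utilisation is
// easy to reason about. Returns undefined when every volume is full,
// signalling that a new volume needs provisioning.
function pickVolume(volumes) {
    const candidates = volumes
        .filter((v) => v.accessPoints < MAX_ACCESS_POINTS)
        .sort((a, b) => b.accessPoints - a.accessPoints);
    return candidates[0];
}
```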
What is not currently clear is the mount-points-per-VPC limit: does it apply to the underlying nodes or to the pods (i.e. individual NR instances)? That is an order-of-magnitude difference, and if it's the latter we're already beyond that limit. @hardillb is following up on this via the AWS support forums.
Clarifications on the EFS limits:
Looking at what will be needed for AWS EFS with AccessPoints I think we will need 2 separate storage solutions.
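For reference, dynamic provisioning of EFS access points is configured through the AWS EFS CSI driver's StorageClass, where each PVC gets its own access point. A minimal sketch (the class name is hypothetical and the `fileSystemId` is a placeholder):

```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-flowfuse                    # hypothetical name
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap              # one access point per PVC
  fileSystemId: fs-0123456789abcdef0    # placeholder volume id
  directoryPerms: "700"
```

With the 120-access-points-per-volume limit, each such StorageClass (and its backing volume) serves at most 120 instances, which is one reason multiple storage definitions will be needed.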
@hardillb are you able to provide a rough delivery date for this please?
Assuming testing today goes well, the technical parts are pretty much done, with the exception of how to enforce the quota requirements.
At this time I have no idea how long that will take.
Any updates please @hardillb - release is next week, and marketing are asking whether this highlight will be delivered.
The code changes are up for review
We need to:
Okay, and when will we answer those questions, who is responsible for answering/actioning?
I'll get with @ppawlowski tomorrow to install the EFS driver so it's ready
The question on access was asked higher up and then left. It's a product call whether we make this only available to higher tiers, but the old file storage is currently available to all, just with different quota sizes.
The fact we don't have a quota solution for this at the moment may impact the last point.
@hardillb status update please - ready to go for tomorrow?
@joepavitt should be. I need the following reviewed/merged:
I'm finishing off the last of the environment prep at the moment.
@hardillb assuming we can close this out now?
The core feature has been delivered - so yes, I think we can close this off.
There are some residual tasks to complete which we should get raised separately.
Description
We introduced the File Nodes and the File Server as a workaround to the fact our NR instances do not have a persistent file system. This allowed us to provide 'working' File Nodes that were familiar to existing NR users; however, they have some significant drawbacks.
The `sqlite` node provides a super convenient way to store queryable data locally - but the db file has to be on the local disk.

Following a number of discussions on this topic, we want to revisit the original decision not to attach persistent storage volumes to our cloud-hosted Node-RED instances.
This is only scoped to the k8s driver in the first instance. Docker will require a different approach and LocalFS already has local file system access.
The goal will be for each instance to have a volume attached with the appropriate space quota applied.
Open questions:
User Value
Prior to this, interaction with cloud-based file storage was only possible using our own custom file nodes. This PR will allow any nodes (e.g. ui-builder, sqlite) to have file persistence when running on FlowFuse Cloud.
Customer Requests