Closed rothnic closed 1 year ago
Do we need a how-to on what to do with your own data after updating to this version?
So, what it does is copy the files from /src/data to ~/.cassandra on startup. So, if you update the app like you normally do, it should work. However, users will need to know that the files are now located in that folder, or how to change that to a different location.
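The copy-on-startup behavior described above can be sketched roughly as follows. This is only an illustration, not the code from the PR: the function name `ensure_data_dir` is hypothetical, and it assumes the copy happens only when the target directory does not exist yet, so existing user data is left alone.

```python
import shutil
from pathlib import Path


def ensure_data_dir(initial_data: Path, data_path: Path) -> bool:
    """Seed data_path from the bundled initial data on first start.

    Returns True if the files were copied, False if data_path already
    existed. (Assumption: an existing directory is never overwritten.)
    """
    if data_path.exists():
        # A previous run already created the directory; keep the user's data.
        return False
    # copytree creates data_path, including missing parent directories.
    shutil.copytree(initial_data, data_path)
    return True
```

Called as `ensure_data_dir(Path("src/data"), Path.home() / ".cassandra")`, a second start would find the directory in place and skip the copy.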
The one thing missing from this PR is updates to the readme. I was unsure how to update it, given that it is in German at the moment. I'd be happy to update the sections that need it in English if that works.
Some examples of executing the app:
Run the app with default configuration:
python app.py start
See the input options for starting the app:
> python app.py start --help
Usage: app.py start [OPTIONS]
Start the CaSSAndRA Server
Only some Dash server options are handled as command-line options. All other
options should use environment variables. Find supported environment
variables here: https://dash.plotly.com/reference#app.run
Options:
-h, --host TEXT [default: 0.0.0.0]
-p, --port INTEGER [default: 8050]
--proxy TEXT format={{input}}::{{output}} example=http://
0.0.0.0:8050::https://my.domain.com
--data_path TEXT [default: /Users/nroth/.cassandra]
--debug Enables debug mode for dash application
--app_log_level [DEBUG|INFO|WARN|ERROR|CRITICAL]
[default: DEBUG]
--app_log_file_level [DEBUG|INFO|WARN|ERROR|CRITICAL]
[default: DEBUG]
--server_log_level [DEBUG|INFO|WARN|ERROR|CRITICAL]
[default: ERROR]
--pil_log_level [DEBUG|INFO|WARN|ERROR|CRITICAL]
[default: WARN]
--help Show this message and exit.
Build docker image:
docker build . -t cassandra
Run the docker image (simple example, OK for macOS, which handles file permissions through Docker Desktop):
docker run -it -v /Users/nroth/.cassandra:/home/cassandra/.cassandra cassandra start --help
docker run -it -v /Users/nroth/.cassandra:/home/cassandra/.cassandra cassandra start
Run the docker image (with user id mapping, needed for Linux machines):
export HOST_UID=$(id -u)
export HOST_GID=$(id -g)
docker run -it --rm -e HOST_UID=$HOST_UID -e HOST_GID=$HOST_GID -v /Users/nroth/.cassandra:/home/cassandra/.cassandra cassandra start
Just realized that I forgot to set the default log level for the cassandra.log file to the same level as before. Going to fix that and push that change in.
Some questions about the changes:
After merging, the app can only be started with the "start" argument, correct? If so, we definitely need a README update before merging.
How can I start the app in VSCode? Only from the command line with the start argument? The play button in VSCode leads to the default printout, correct? How can I use the VSCode built-in debug mode?
I didn't understand what happened with the data directory. Is it automatically moved to the new location? (/home/user/.cassandra/data...) ---> Edit: I think I know what happens. The magic is in /src/backend/data/utils.py. Does that mean I can remove the /src/data directory from the repository?
I know at least one user is running multiple instances of cassandra on one machine. What will happen then? Should he start the second instance with directory and port arguments? ./app.py --data_path '/home/...' --port XXXX
If we make small change in app.py:
if __name__ == "__main__": start()
the behavior of cassandra is more familiar, app.py --help still works, and I can use the built-in VSCode debugger. Why do you use the cli() function?
- After merging, the app can only be started with the "start" argument, correct? If so, we definitely need a README update before merging.
Yeah, so I set it up that way originally in case we had other commands we wanted to add in the future, or to create other variations of the start command. For example, start_debug could be set up to handle passing in some of the values you'd want when debugging. Or, if we wanted to run a test suite, that could be handled with python app.py test.
- How can I start the app in VSCode? Only from the command line with the start argument? The play button in VSCode leads to the default printout, correct? How can I use the VSCode built-in debug mode?
I held back on committing my vscode commands, but that might be useful. Here is my launch.json file that I could commit:
{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "App",
            "type": "python",
            "request": "launch",
            "program": "app.py",
            "args": ["start"],
            "console": "integratedTerminal",
            "justMyCode": true,
            "cwd": "${workspaceFolder}/CaSSAndRA/",
            "python": "${command:python.interpreterPath}"
        },
        {
            "name": "Debug App",
            "type": "python",
            "request": "launch",
            "program": "app.py",
            "args": ["start", "--debug"],
            "console": "integratedTerminal",
            "justMyCode": true,
            "cwd": "${workspaceFolder}/CaSSAndRA/",
            "python": "${command:python.interpreterPath}"
        },
        {
            "name": "Python: Current File",
            "type": "python",
            "request": "launch",
            "program": "${file}",
            "console": "integratedTerminal",
            "justMyCode": true,
            "python": "${command:python.interpreterPath}"
        }
    ]
}
- I didn't understand what happened with the data directory. Is it automatically moved to the new location? (/home/user/.cassandra/data...) ---> Edit: I think I know what happens. The magic is in /src/backend/data/utils.py. Does that mean I can remove the /src/data directory from the repository?
Correct, .cassandra is now the data directory. However, /src/data does still need to be there for now because it is essentially the initial data for the .cassandra directory. Without it, a brand new run won't have the data required (I think). I'm not sure if the app depends on that initial data being there.
- I know at least one user is running multiple instances of cassandra on one machine. What will happen then? Should he start the second instance with directory and port arguments? ./app.py --data_path '/home/...' --port XXXX
Correct. Instead of having to sync two copies of the cassandra git project, they would sync one project, then point the second execution to another directory with another port.
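As a rough sketch of the two-instance setup, the snippet below builds the command line for each instance from a single checkout. The `--data_path` and `--port` flags come from the help output earlier in this thread; the second directory name and port 8051 are made-up examples, and the `start` argument assumes the subcommand form of the CLI.

```python
import sys
from pathlib import Path
from typing import List


def start_command(data_path: Path, port: int) -> List[str]:
    """Build the argv for one instance, using flags from `app.py start --help`."""
    return [
        sys.executable, "app.py", "start",
        "--data_path", str(data_path),
        "--port", str(port),
    ]


# Hypothetical directories and ports; each instance needs its own pair.
instances = [
    (Path.home() / ".cassandra", 8050),
    (Path.home() / ".cassandra_second", 8051),
]
commands = [start_command(path, port) for path, port in instances]

for command in commands:
    # Print instead of launching; subprocess.Popen(command) would start it.
    print(" ".join(command))
```

Both commands run from the same project folder, so only one git checkout has to be kept up to date.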
BTW, I am good with the changes you made to just default directly into starting the app for now until we need some other command. Just wanted to explain the thought behind how it was set up. If you want to just default to starting the app when you do python app.py, I can update the readme with some examples to describe different ways of executing the app, including an example running two instances of it.
After thinking about this a bit more, I do think I could improve the command-line output when the user first runs the app. So, we could check whether data_path exists when you run python app.py, then if it doesn't exist, output some information to the command line to tell the user what is going to happen. This could also give them the chance to change the data_path from the default setting if they want.
Something like this (pseudocode):
if (data_path is the default) and (data_path doesn't exist):
    # we know this is a first-time start up
    # tell the user "You are starting cassandra for the first time. We will copy the initial data files from /src/data to ~/.cassandra, where all settings, logs, and runtime data will be stored".
    response = prompt_user("Do you want to continue starting the server using {data_path}?")
    # click would collect the response of yes or no from the command line at this point
    if response:
        # user said they wanted to continue, so we continue starting the server
    else:
        # do not start the server
        # tell the user how they can see the options for starting the server and how they can pass in an alternate data_path
else:
    # do nothing because we know we have started the server before
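One way the pseudocode above could look as runnable Python is sketched below. The names `should_start` and `ask_yes_no` are hypothetical, and the prompt is passed in as a function so the logic can be exercised without a terminal. In the click-based app, `click.confirm` could presumably be passed as `confirm` instead of the `input`-based helper.

```python
from pathlib import Path
from typing import Callable

DEFAULT_DATA_PATH = Path.home() / ".cassandra"


def should_start(data_path: Path, confirm: Callable[[str], bool],
                 default_path: Path = DEFAULT_DATA_PATH) -> bool:
    """Return True if the server should go ahead and start."""
    if data_path != default_path or data_path.exists():
        # Custom path, or the directory is already there: nothing to ask.
        return True
    # Default path that does not exist yet, so this is a first-time start.
    print("You are starting cassandra for the first time. The initial data "
          f"files will be copied from /src/data to {data_path}, where all "
          "settings, logs, and runtime data will be stored.")
    if confirm(f"Do you want to continue starting the server using {data_path}?"):
        return True
    print("Server not started. Run `python app.py --help` to see the options, "
          "including --data_path for an alternate location.")
    return False


def ask_yes_no(question: str) -> bool:
    """Minimal stdin prompt; anything other than y/yes counts as no."""
    return input(f"{question} [y/N]: ").strip().lower() in {"y", "yes"}
```

Usage would be along the lines of `if should_start(data_path, ask_yes_no): ...` before the server is started.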
@EinEinfach I updated the readme, made the suggested changes, then added some command line output for people that might just update and run the new version without looking at the docs. Here is what the output looks like if you just run python app.py without having ~/.cassandra. Afterwards, it will start up like normal.
thx
file_paths is also needed in backendserver.py in stop() function. Should file_paths be a kind of global variable?
I think backendserver in the end should be a class, but I was trying to avoid doing major refactoring. Essentially, we should create a BackendServer(filepaths=filepaths) or something along those lines, then there would be start/stop methods on the class. That would provide kind of a global access to the filepaths object for the backendserver, without it being a true global variable, which is something I think most people try to avoid.
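A minimal sketch of that idea, assuming a simplified stand-in for the file paths object (the real one in the project has a different shape, and the `FilePaths` fields here are invented for illustration):

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class FilePaths:
    """Hypothetical stand-in for the project's file paths object."""
    data: Path
    log: Path


class BackendServer:
    """Holds the file paths as instance state instead of a module global."""

    def __init__(self, file_paths: FilePaths) -> None:
        self.file_paths = file_paths
        self.running = False

    def start(self) -> None:
        # Real startup work (threads, connections) would go here.
        self.running = True

    def stop(self) -> Path:
        # stop() reaches the paths through self, so no global is needed.
        self.running = False
        return self.file_paths.data


root = Path.home() / ".cassandra"
server = BackendServer(FilePaths(data=root / "data", log=root / "log"))
server.start()
saved_to = server.stop()
```

Every method that needs the paths gets them from the instance, which also makes it straightforward to create a second server with a different data directory.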
I was trying to avoid doing major refactoring.
Yes, it will be a major refactoring. At some point we will have to change that, maybe not in the next few months, but the time will come. I fixed the stop() function for the moment. It wasn't too difficult; file_paths was already a global variable.
By the way, I have a question from the forum: how do you update cassandra now if the app is running in a docker container?
There are instructions in the readme that should still apply, even when coming from a previous version. The difference is just where the host directory maps into the container (/home/cassandra/.cassandra).
The next step will be to set up automatic builds with version tagging, so most people won't ever have to check out the git repo unless they are developing the app. That is more complicated now that Docker Hub no longer does that automatically for free, but it looks like GitHub workflows can orchestrate that.
The core change of this PR is to move the app data outside of the app folder. This is best practice for git projects and is essential for using docker. By moving data outside of the app folder, we can much more easily update the app, whether it is run directly or with docker.
Changes:
~/.cassandra by default
app.py
Closes #42 and #57