How to get ps info as a data.frame?

pjastam commented 4 years ago

If I want to stop a running container, then I need the container ID as an argument in the docklet_stop() function. This container ID can be found in the screen output produced by this command:

droplet_id %>% docklet_ps(all = TRUE)

Subsequently, we can fill out the container ID and stop the container as follows:

droplet_id %>% docklet_stop(container = c("container ID"))

However, instead of doing this interactively, I would like to save the docklet_ps() screen output as a data.frame, with the container ID as one of the stored variables, and subsequently call the container ID variable as an argument in the docklet_stop() function. How do I do that? Or is there an alternative route? Many thanks for your thoughts!

sckott commented 4 years ago

Thanks for the issue @pjastam

All docklet_/droplet_ fxns are built to be pipeable to make that workflow easier. A downside of that is they always return the droplet object. So it would take a new function to return the actual output of the docker command.

I have something on a branch you can try: install it with remotes::install_github("sckott/analogsea@return-data"), restart R, then see the droplet_ssh_data function. You have to give the full docker command. It returns the output of that docker command as a string.

d %>% droplet_ssh_data("docker ps -a")

It's not easy to parse the output of docker commands.

It's not clear how we'd incorporate this ability to return data instead of the droplet itself, as it would break existing workflows. Maybe a parameter in each function could toggle the behavior.
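
The toggle idea could be sketched roughly as follows. This is only an illustration: fake_ps and its return_data argument are hypothetical names, not part of analogsea, and the output string is made up rather than fetched over ssh.

```r
# Hypothetical stand-in for a pipeable helper; not an analogsea function.
fake_ps <- function(droplet, return_data = FALSE) {
  # Pretend this string came back from running `docker ps -a` on the droplet.
  out <- "CONTAINER ID   IMAGE\n3f2a1b4c5d6e   rocker/rstudio"
  if (return_data) {
    return(out)          # caller gets the command output as a string
  }
  cat(out, "\n")         # default: print the output, then keep the pipe going
  invisible(droplet)
}

droplet <- list(id = 1L)                     # placeholder for a droplet object
same <- fake_ps(droplet)                     # returns the droplet, as today
txt  <- fake_ps(droplet, return_data = TRUE) # returns the text instead
```

Because the default stays FALSE, existing piped workflows would keep receiving the droplet object.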

pjastam commented 4 years ago

Thanks @sckott for your very clear explanation of the cause of the problem and also for your effort to find a solution anyhow.

When I follow your suggestions, I encounter an error when calling the new function droplet_ssh_data. Here is my workflow:

I removed the analogsea package from my RStudio install, and restarted RStudio
remotes::install_github("sckott/analogsea@return-data")
library(analogsea)
d <- droplet(id) #with the id of my test droplet
d %>% droplet_ssh_data("docker ps -a")

This returns: "Error in droplet_ssh_data(., "docker ps -a") : could not find function "droplet_ssh_data""

sessionInfo():
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Any thoughts on what I am doing wrong?

sckott commented 4 years ago

Sorry about that, I forgot to update the NAMESPACE. Reinstall and try again.

pjastam commented 4 years ago

It works!

The last piece of the puzzle is to get the raw OUTPUT from droplet_ssh_data into a data frame. As you already noted in a previous comment, though, it's not easy to parse the output of docker commands.

My thought was to add a comma separator to the raw OUTPUT and then use a command like read.table(text = OUTPUT, header = TRUE, sep = ",", strip.white = TRUE) to turn it into a data frame. This looks a bit like your comment on lines 186-187 in droplet_ssh_data. But I suppose that adding comma separators to the raw OUTPUT in the right places is the difficult part here?
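
One possible way around inserting separators is to split each line on runs of two or more spaces. The sketch below assumes the columns are padded with at least two spaces and that no cell is empty; the raw string is a simplified, made-up stand-in for docker ps output, not something captured from a real droplet.

```r
# Illustrative text only: a simplified, made-up stand-in for `docker ps -a` output.
raw <- paste(
  "CONTAINER ID   IMAGE            COMMAND   CREATED       STATUS       NAMES",
  "3f2a1b4c5d6e   rocker/rstudio   \"/init\"   2 hours ago   Up 2 hours   rstudio",
  "9a8b7c6d5e4f   nginx:latest     \"nginx\"   3 days ago    Up 3 days    web",
  sep = "\n"
)

lines <- strsplit(raw, "\n", fixed = TRUE)[[1]]

# Split on runs of two or more spaces, so single-space values such as
# "CONTAINER ID" and "2 hours ago" stay in one piece.
fields <- strsplit(lines, " {2,}")

# Every line must yield the same number of fields for this approach to be safe.
stopifnot(length(unique(lengths(fields))) == 1)

ps_df <- as.data.frame(do.call(rbind, fields[-1]), stringsAsFactors = FALSE)
names(ps_df) <- make.names(fields[[1]])  # "CONTAINER ID" becomes "CONTAINER.ID"
ps_df$CONTAINER.ID
```

An empty cell (for example a container with no published ports) would shift the remaining columns, which is why the field-count check is there.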

sckott commented 4 years ago

There is the --format flag for docker ps https://docs.docker.com/engine/reference/commandline/ps/#formatting - haven't played with it too much yet.

pjastam commented 4 years ago

Using the --format flag, an easy solution might be:

container_id <- d %>% droplet_ssh_data("docker ps --format 'table {{.ID}}\t'")
container_id <- read.table(text = container_id, header = TRUE, sep = "\t")$CONTAINER.ID

Now this container_id can be used in commands like d %>% docklet_stop(container = container_id), for example.

A few notes:

I added the table option to the droplet_ssh_data() argument in order to get the variable name CONTAINER ID as a header. Note that this code gives the right container_id under the assumption that there is only 1 active container on the droplet.
If there are multiple active containers on the droplet, then one has to add the name of the repo and/or image to the docker ps argument in droplet_ssh_data() to be able to filter the intended container_id. Such a docker ps argument might look like docker ps --format 'table {{.ID}}\t{{.Names}}\t{{.Image}}', for example. If this is a valid use case, we should explore it further.
The variable name CONTAINER ID in the table that results from the first line of code contains a space (!) between CONTAINER and ID. In order to make sure that CONTAINER ID is read as 1 variable name in the second line of code, the escape character \t was used in both lines of code above.
The resulting container_id is a factor by default.
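
The factor default applies to R before 4.0.0 (such as the 3.6.1 session above); from 4.0.0 on, read.table returns character columns by default. A related pitfall is that an ID made up only of digits is guessed to be a number. The sketch below uses a made-up string in place of the droplet_ssh_data() result, so the exact whitespace of real output is an assumption.

```r
# Made-up stand-in for the string returned by droplet_ssh_data();
# real output may differ in whitespace.
ps_text <- "CONTAINER ID\n123456789012"

# With default column types, the all-digit ID is guessed to be a number.
guessed <- read.table(text = ps_text, header = TRUE, sep = "\t")
class(guessed$CONTAINER.ID)

# Forcing character keeps the ID exactly as docker printed it,
# and also avoids factors on older R versions.
kept <- read.table(text = ps_text, header = TRUE, sep = "\t",
                   colClasses = "character")
kept$CONTAINER.ID
```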

pjastam commented 4 years ago

Use case: multiple active containers

container_id <- d %>% droplet_ssh_data("docker ps --format 'table{{.ID}},{{.Image}},{{.Names}}'")
container_id <- read.table(text = container_id, header = TRUE, sep = ",") %>% dplyr::filter(IMAGE == repo) %>% dplyr::select(CONTAINER.ID)

Note that:

The comma separator is used instead of \t. This solved the problem caused by the space between CONTAINER and ID in the header (see my previous comment).
I set IMAGE equal to repo in the filter command. This repo is a character string that is identical to the repo argument in the docklet_run command that is used to start running the container.
The resulting container_id is a factor by default.
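
Another way to sidestep the space in the CONTAINER ID header might be to drop the table prefix, in which case docker ps prints no header row, and to supply column names on the R side. The sketch below is base R only; the input string is a made-up stand-in for the command output, and the column names id, image and name are my own choice.

```r
# Made-up stand-in for the output of
#   docker ps --format '{{.ID}},{{.Image}},{{.Names}}'
# (no "table" prefix, so no header row is printed).
ps_text <- "3f2a1b4c5d6e,rocker/rstudio,rstudio\n9a8b7c6d5e4f,nginx:latest,web"

containers <- read.table(
  text = ps_text, header = FALSE, sep = ",",
  col.names = c("id", "image", "name"),  # names chosen here, not by docker
  colClasses = "character"               # keep IDs as plain strings
)

repo <- "rocker/rstudio"  # assumed to match the image passed to docklet_run
container_id <- containers$id[containers$image == repo]
container_id
```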

sckott commented 4 years ago

Do you think droplet_ssh_data is enough to do what you want to do? Or do you want more functionality added into this package for these use cases?

pjastam commented 4 years ago

droplet_ssh_data is enough to do what I want to do, thanks for your effort @sckott

For those interested in my use case, I slightly updated/improved my function call to droplet_ssh_data as follows:

container_id <- d %>% 
  droplet_ssh_data("docker ps --format 'table{{.ID}},{{.Image}},{{.Names}}'")

container_id <- read.table(text = container_id, header = TRUE, sep = ",", stringsAsFactors = FALSE) %>%
  dplyr::filter(IMAGE == repo) %>%
  dplyr::pull(CONTAINER.ID)
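
If the filter matches no container, or more than one, the result is a vector of length zero or several IDs. A small guard can catch that before anything is stopped. The helper below is hypothetical (pick_container_id is not part of analogsea) and is run here on a made-up string rather than real droplet_ssh_data() output.

```r
# Hypothetical helper, not part of analogsea: returns exactly one container ID
# for a given image, or stops with a clear message.
pick_container_id <- function(ps_text, repo) {
  ps <- read.table(text = ps_text, header = TRUE, sep = ",",
                   colClasses = "character")
  ids <- ps$CONTAINER.ID[ps$IMAGE == repo]
  if (length(ids) != 1) {
    stop("expected 1 container for image '", repo, "', found ", length(ids))
  }
  ids
}

# Made-up stand-in for the string returned by droplet_ssh_data().
ps_text <- paste(
  "CONTAINER ID,IMAGE,NAMES",
  "3f2a1b4c5d6e,rocker/rstudio,rstudio",
  "9a8b7c6d5e4f,nginx:latest,web",
  sep = "\n"
)

pick_container_id(ps_text, "rocker/rstudio")
```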

sckott commented 4 years ago

Thanks @pjastam for raising this issue, glad we sorted out something that works for you

pachadotdev / analogsea

How to get ps info as a data.frame? #196