psychoinformatics-de / ClusteredNetwork_pub

Publication outcome dataset #9

Open mih opened 7 months ago

mih commented 7 months ago

Here is a log. For now, this uses the pickle-format input data so that we can tackle one issue at a time.

❯ datalad clone git@github.com:nawrotlab/ClusteredNetwork_pub.git rostami_etal_2024
❯ cd rostami_etal_2024
❯ datalad create --force -c yoda .
❯ git mv -f readme.md README.md
❯ datalad save -m "Move README to standard slot"

❯ mkdir container
❯ git mv Dockerfile environment.yml container
❯ cat << EOT > container/.gitattributes
/* annex.largefiles=nothing
EOT
❯ datalad save -m "Consolidate container setup to dedicated directory"
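For readers unfamiliar with the attribute: `annex.largefiles=nothing` tells git-annex that no matching file counts as "large", so the files in `container/` go straight into git instead of the annex. A minimal sketch for checking which value applies to a path, run in a throwaway repository (the `demo_repo` name is made up for illustration):

```shell
# Throwaway repository, only for illustration (not part of the dataset).
demo_repo=$(mktemp -d)
git init -q "$demo_repo"
mkdir -p "$demo_repo/container"

# Same attribute line as in the log above.
cat << EOT > "$demo_repo/container/.gitattributes"
/* annex.largefiles=nothing
EOT

# Ask git which value of annex.largefiles applies to a file in container/.
attr_report=$(git -C "$demo_repo" check-attr annex.largefiles container/Dockerfile)
echo "$attr_report"
```

The `/*` pattern only covers entries directly inside `container/`, which is presumably why the image directory gets its own `.gitattributes` in the build step.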

❯ datalad run -m "Build docker image with analysis environment" -i container/Dockerfile -i container/environment.yml -o container/image sh -c "rm -rf container/image; docker build -t clusterednetwork:latest container && python -m datalad_container.adapters.docker save clusterednetwork:latest container/image && echo '**/*json annex.largefiles=nothing\nrepositories annex.largefiles=nothing\n**/VERSION annex.largefiles=nothing' > container/image/.gitattributes"
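One caveat on the last step of that command: whether `echo` turns `\n` into a line break depends on the shell behind `sh` (the dash built-in does, the bash built-in does not by default). A more portable sketch uses `printf` with one argument per line. The temporary `target` file is only a stand-in for `container/image/.gitattributes`:

```shell
# Stand-in for container/image/.gitattributes (hypothetical path for this demo).
target=$(mktemp)

# printf repeats the '%s\n' format for every argument,
# so each rule ends up on its own line in any POSIX shell.
printf '%s\n' \
  '**/*json annex.largefiles=nothing' \
  'repositories annex.largefiles=nothing' \
  '**/VERSION annex.largefiles=nothing' > "$target"

# Count the lines that were written.
line_count=$(wc -l < "$target" | tr -d ' ')
echo "wrote $line_count lines"
```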

# This rather complicated call does some of the Docker
# environment setup outside, which has the benefit that
# we can run the container as a regular user
# and get files created on the host file system that are
# already owned by that user.
# For now, this setup uses a temporary directory on the host system
# as the HOME directory inside the container. This could be moved
# to a place inside the container's own file system.
❯ datalad containers-add -i container/image --call-fmt "{{python}} -m datalad_container.adapters.docker run {img} bash -c 'mkdir -p /tmp/dockertmp && export HOME=/tmp/dockertmp && . /opt/conda/etc/profile.d/conda.sh && conda activate base && export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib && export CFLAGS=-I/usr/local/include && export LDFLAGS=-L/usr/local/lib && conda activate ClusteredNetwork_pub && {cmd} && rm -rf /tmp/dockertmp'" docker
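A side note on quoting in that call: the format string is in double quotes, so the host shell expands `$LD_LIBRARY_PATH` when the container is registered, not when it runs. Whether that matters depends on which value is intended. A small sketch of the difference, using a made-up variable `DEMO_PATH` rather than the real one:

```shell
# Made-up variable standing in for LD_LIBRARY_PATH on the host.
DEMO_PATH=/host/lib

# Unescaped: the host value is baked into the string at registration time.
registered_unescaped="export DEMO_PATH=$DEMO_PATH:/usr/local/lib"

# Escaped: the literal variable reference survives and would be
# expanded later by the shell that runs inside the container.
registered_escaped="export DEMO_PATH=\$DEMO_PATH:/usr/local/lib"

echo "$registered_unescaped"
echo "$registered_escaped"
```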

❯ datalad configuration --scope branch set datalad.run.substitutions.python=python
❯ datalad save -m 'Use the "standard" python from a venv (or alike), if there is need for a different one, this

❯ sed -i 's,^/data,#/data,' .gitignore
❯ datalad save -m 'We want to track the data'

❯ datalad clone -d . https://github.com/psychoinformatics-de/joe_and_lili_pickle.git inputs/joe_and_lili

❯ ln -s inputs/joe_and_lili/data data
❯ datalad save -m 'data shortcut for now'

This is now a dataset that links all study components and has a container registered for code execution:

❯ datalad containers-run -n docker bash
[INFO   ] Making sure inputs are available (this may take some time) 
[INFO   ] == Command start (output follows) ===== 
I have no name!@c9a32d180552:/tmp$ ls
CHANGELOG.md  Download.sh  README.md  code  container  data  fig_codes  inputs  utils
I have no name!@c9a32d180552:/tmp$ ls data | head -2
joe011_3.pickle
joe011_4.pickle

mih commented 7 months ago

The container call setup needs to change: right now it would hide an error exit of the main payload script. The steps need to be chained with && all the way through.

Update: Now fixed in the top post.
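
To illustrate the exit-status concern, here is a minimal sketch. It assumes the earlier call separated the payload from the cleanup with `;` (the thread does not show that version), and `payload` is a made-up stand-in for the real script:

```shell
# Made-up stand-in for the payload command; it fails on purpose.
payload() { return 3; }

# Chained with ';' the last command (the cleanup stand-in 'true')
# decides the exit status, so the payload failure is hidden.
status_semicolon=0
( payload; true ) || status_semicolon=$?

# Chained with '&&' the chain stops at the first failure
# and that failure's exit status is passed on.
status_and=0
( payload && true ) || status_and=$?

echo "with ';':  exit status $status_semicolon"
echo "with '&&': exit status $status_and"
```

The trade-off is that with `&&` the final cleanup step is skipped whenever the payload fails.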