Open jkanche opened 4 months ago
We've only ever done RStudio containers in the past, but happy to do the manual work to support jupyter. Please let me know the container, port at which jupyter is exposed, and name/description for your workshop, and I'll do my best.
I'll do my best to get it up by tomorrow. Given the tight deadline, with the conference starting in a few days, I'd encourage you to also plan a backup such as Colab in case it isn't ready by then, but let's try the platform first.
Thank you @almahmoud, I would like to use https://jupyter-docker-stacks.readthedocs.io/en/latest/ and it runs on port 8888
Hey @jkanche, you pointed to the general Jupyter Docker Stacks docs, not to a specific container. Please let me know which of the many Jupyter containers you want to use, or better yet, create a custom container with your packages pre-installed on top of one of them. If you don't have experience with Docker, or don't need a specific container and any Jupyter environment will do, please let me know which PyPI/conda/R packages you need and I'll try to build it on my side. Also, do you need a Jupyter container with both R and Python kernels, or just a Python kernel?
Hi @almahmoud, thank you so much for helping me out here. I tried to use bioconductor_docker:devel to create an image but ran into many issues. It would be super helpful to have one with both the Python and R kernels. I have the packages listed in the workshop repository:
Python dependencies: https://github.com/BiocPy/BiocWorkshop2024/blob/master/requirements.txt
R dependencies: https://github.com/BiocPy/BiocWorkshop2024/blob/master/rpackages.R
If having both R and Python is too much trouble, just a simple jupyter image with the Python packages installed would also be very helpful. I would really appreciate any help here.
Quick update: I was able to publish an image containing the notebook and the relevant Python packages to the GitHub registry: https://github.com/BiocPy/BiocWorkshop2024/pkgs/container/biocworkshop2024%2Fbuilder. The Dockerfile used for the build is here: https://github.com/BiocPy/BiocWorkshop2024/blob/master/Dockerfile
Jupyter notebook runs on port 8889. It uses tokens; do you know if there's a way to disable token-based authentication?
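One common way to skip the token and password prompts is to start the server with empty `token` and `password` settings. The sketch below only assembles such a command line. The helper name `no_auth_command` is hypothetical, and the `ServerApp` trait prefix is an assumption that fits Jupyter Server based setups; older classic Notebook installs use `NotebookApp` instead, and newer Jupyter Server releases may prefer `IdentityProvider.token`.

```python
from __future__ import annotations


def no_auth_command(port: int = 8889, app: str = "ServerApp") -> list[str]:
    """Build a `jupyter lab` command line with token and password auth disabled.

    `app` is the config class prefix (assumption: "ServerApp"; pass
    "NotebookApp" for the classic notebook server).
    """
    return [
        "jupyter", "lab",
        "--ip=0.0.0.0",           # listen on all interfaces inside the container
        f"--port={port}",
        "--no-browser",
        f"--{app}.token=",        # empty token: no token prompt
        f"--{app}.password=",     # empty password: no password screen
    ]


if __name__ == "__main__":
    print(" ".join(no_auth_command()))
```

The resulting list can be passed to `subprocess.run` or copied into a container start command.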
Hey @jkanche, I actually made a container for you already, and it's now deployed to the instance at workshop.bioconductor.org. Please try it out and let me know if it works.
Awesome, thank you so much. I am on a password screen, do you know what the default password is?
Sorry about that, the startup command didn't take effect as expected the first time, try again now, there should be no password, and you should have both R and python kernels. Here is my simple Dockerfile:
FROM jupyter/r-notebook:r-4.3.1
USER root
RUN apt update -qq && \
    apt install python3-dev build-essential -y && \
    curl -O https://raw.githubusercontent.com/Bioconductor/bioconductor_docker/devel/bioc_scripts/install_bioc_sysdeps.sh && \
    bash install_bioc_sysdeps.sh 3.18 && \
    pip install -r <(curl -s https://raw.githubusercontent.com/BiocPy/BiocWorkshop2024/master/requirements.txt) && \
    curl -s https://raw.githubusercontent.com/BiocPy/BiocWorkshop2024/master/rpackages.R | Rscript -
I used the latest available R notebook to make the Jupyter setup easiest, but that means you have to use Bioc 3.18 and R 4.3.1. Lmk if that's an issue and I can try to make an updated container.
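To confirm that both kernels are registered in a container like this, one option is to inspect the output of `jupyter kernelspec list --json`. The sketch below parses that JSON and checks for a Python and an R kernel. The helper names and the `SAMPLE` payload are hypothetical; the sample only imitates the general shape of the real output.

```python
import json


def kernel_languages(kernelspec_json: str) -> dict:
    """Map kernel name -> language from `jupyter kernelspec list --json` output."""
    specs = json.loads(kernelspec_json)["kernelspecs"]
    return {name: info["spec"].get("language", "") for name, info in specs.items()}


def has_r_and_python(kernelspec_json: str) -> bool:
    """Return True if at least one Python kernel and one R kernel are registered."""
    langs = {lang.lower() for lang in kernel_languages(kernelspec_json).values()}
    return {"python", "r"} <= langs


# Hypothetical sample payload, not captured from the workshop container.
SAMPLE = json.dumps({
    "kernelspecs": {
        "python3": {"resource_dir": "/opt/conda/share/jupyter/kernels/python3",
                    "spec": {"display_name": "Python 3", "language": "python"}},
        "ir": {"resource_dir": "/opt/conda/share/jupyter/kernels/ir",
               "spec": {"display_name": "R", "language": "R"}},
    }
})

if __name__ == "__main__":
    print(kernel_languages(SAMPLE))
    print(has_r_and_python(SAMPLE))
```

Inside the container, the real JSON can be obtained with `subprocess.run(["jupyter", "kernelspec", "list", "--json"], capture_output=True, text=True).stdout`.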
@almahmoud thank you so much. I had a couple of issues during this session: 1) file permission issues when packages download something, and 2) the sqlite version shipped in the container is too old.
(2) can be fixed by:
# Download and build SQLite3 from source
RUN wget --no-check-certificate https://www.sqlite.org/2024/sqlite-autoconf-3450300.tar.gz && \
tar -xvf sqlite-autoconf-3450300.tar.gz && \
cd sqlite-autoconf-3450300 && \
./configure && \
make && \
make install && \
export PATH="/usr/local/lib:$PATH" && \
cd .. && \
rm -rf sqlite-autoconf-3450300.tar.gz sqlite-autoconf-3450300
# Set environment variable for LD_LIBRARY_PATH
ENV LD_LIBRARY_PATH=/usr/local/lib
Do you know what's causing (1)?
I can modify the container, thank you for providing the commands! Re 1) Are you writing to /home/jovyan? I believe the default working directory might have permission issues, as I didn't account for the user in the Jupyter container on the NFS, but you shouldn't need that anyway. If you also can't write to /home/jovyan/, lmk, and if you can provide a reproducible example that'd be really helpful too.
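A small, package-independent reproducer for this kind of permission problem is to try creating a file in each directory of interest. This is only a sketch: the helper name `is_writable` is hypothetical, and the directories checked in the demo (the current working directory and the home directory) are assumptions about where a notebook would write.

```python
import os
import tempfile


def is_writable(directory: str) -> bool:
    """Return True if a file can actually be created inside `directory`."""
    try:
        # Creating a real temporary file exercises the same permissions a
        # download or cache write would need; it is removed on close.
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:
        # Covers PermissionError as well as a missing directory.
        return False


if __name__ == "__main__":
    for path in (os.getcwd(), os.path.expanduser("~")):
        print(f"{path}: writable={is_writable(path)}")
```

Running this from a notebook cell shows whether the working directory or the home directory is the one rejecting writes.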
I was running this chunk from the container @ notebook/genomic_ranges.ipynb, which downloads the bed file to the current working directory.
from geniml.bbclient import BBClient
bbclient = BBClient(cache_folder="cache", bedbase_api="https://api.bedbase.org")
bedfile_id = "be4054acf6e3feeb4dc490e6430e358e"
bedfile = bbclient.load_bed(bedfile_id)
peaks = bedfile.to_granges()
filter_chr22 = [x == "chr22" for x in peaks.get_seqnames()]
peaks_chr22 = peaks[filter_chr22]
print(peaks_chr22)
@jkanche Thanks for the details! That was my bad, I forgot to chown the git directory since it's being cloned as root at startup. It should be fixed now, and container updated! Let me know if you encounter any other issues!
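A directory cloned as root typically ends up owned by root, so the notebook user cannot write into it. A quick way to spot this from inside the container is to list paths whose owner differs from the current user. This is a sketch for POSIX systems, and the helper name `not_owned_by_me` is hypothetical.

```python
import os


def not_owned_by_me(root: str) -> list:
    """Return paths under `root` whose owner UID differs from the current user."""
    uid = os.getuid()
    mismatched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Check the directory itself, then every file directly inside it.
        for path in [dirpath] + [os.path.join(dirpath, f) for f in filenames]:
            if os.lstat(path).st_uid != uid:
                mismatched.append(path)
    return mismatched


if __name__ == "__main__":
    for path in not_owned_by_me(os.getcwd()):
        print(path)
```

An empty result means everything under the directory already belongs to the user running the notebook.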
@almahmoud Thank you, this resolves the directory issue. Is there any way we can update the sqlite version in the container? It needs a newer version than the one available through the distros - https://github.com/Bioconductor/workshop-contributions/issues/89#issuecomment-2240163149
Hey @jkanche, are you not seeing the updated sqlite version? I ran your command from above and updated the container already.
The notebook says the sqlite version is 3.43 instead of 3.45. I'm checking to see if there's another env variable I should be setting.
I'm currently running the command as root. When you tried that installation command, did you run it as jovyan within the container, or also as root?
seems like the notebooks are run as jovyan
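Python's `sqlite3` module reports the version of the SQLite library that the interpreter actually loaded, which can differ from a copy built separately under /usr/local. When comparing a root shell with a notebook session, it helps to print the same facts in both places. The function name `runtime_report` is hypothetical; this is a diagnostic sketch, not something taken from the workshop container.

```python
import getpass
import os
import sqlite3
import sys


def runtime_report() -> dict:
    """Collect the user, interpreter, and SQLite library version for this process."""
    try:
        user = getpass.getuser()
    except Exception:
        # Some containers run with a UID that has no passwd entry.
        user = f"uid={os.getuid()}"
    return {
        "user": user,
        "python": sys.executable,
        "sqlite": sqlite3.sqlite_version,  # version of the loaded SQLite library
        "LD_LIBRARY_PATH": os.environ.get("LD_LIBRARY_PATH", ""),
    }


if __name__ == "__main__":
    for key, value in runtime_report().items():
        print(f"{key}: {value}")
```

If the two reports show different interpreters or SQLite versions, the notebook kernel is not using the library that was installed by hand.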
If you are testing this, running section 1.1 of the annotate cell types notebook should give you the list of datasets.
Right now it's an error. I had the same issue before, so I know it's the sqlite version.
I had not tested anything; I simply added your sqlite upgrade suggestion, assuming you had tested it and seen it work. I have now added conda update -y -c conda-forge libsqlite instead, which actually updates the version of sqlite you see in Python. Try it out now. Looking quickly in the container, I see:
>>> import sqlite3
>>> sqlite3.sqlite_version
'3.46.0'
Trying it out in the notebook, seems to work
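The same check can be turned into a guard at the top of a notebook so that an outdated library fails early with a clear answer. This is a sketch under assumptions: the thread does not state the exact minimum the workshop packages need, so the threshold `(3, 45, 0)` is taken from the 3.45 series targeted earlier, and the function name `sqlite_is_at_least` is hypothetical.

```python
import sqlite3

# Assumed minimum, based on the 3.45 release targeted earlier in this thread.
REQUIRED = (3, 45, 0)


def sqlite_is_at_least(minimum=REQUIRED, actual=sqlite3.sqlite_version_info) -> bool:
    """Compare version tuples, e.g. (3, 46, 0) >= (3, 45, 0)."""
    return tuple(actual) >= tuple(minimum)


if __name__ == "__main__":
    status = "ok" if sqlite_is_at_least() else "too old"
    print(f"sqlite {sqlite3.sqlite_version}: {status}")
```

Comparing tuples rather than version strings avoids mistakes such as treating "3.9" as newer than "3.45".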
awesome! thank you very much!
Hi @almahmoud, I am trying to build a Docker image and register both R and Python kernels. Does the version you published to workshop.bioconductor.org look something like this?
https://github.com/BiocPy/BiocWorkshop2024/blob/master/Dockerfile.bioc
Hi, I am presenting a workshop next week on BiocPy: interoperability between R and Python.
Most of the content is in Python, so folks will be following along in a Jupyter environment that already contains all the necessary packages. Should I provide a Docker image with Jupyter notebook and the packages preinstalled? How do I do this?
I currently use Quarto to publish the tutorial website, and the source is hosted here: https://github.com/BiocPy/BiocWorkshop2024