Open gevro opened 1 year ago
From what I can find in your code, it looks like the error is happening here:
.makewide <- function(longdf,type){
  dt<-longdf[, list(GEu=unlist(strsplit(GE,","))), by=c("RG","GE",type)][ #this splits multioverlap gene lists by comma
    , cnt := get(type) / (stringr::str_count(GE,",")+1) ][
    , list(tot=sum(cnt)),by=c("RG","GEu") ]
Though I don't understand how zUMIs works, it seems that GE is a tag loaded from one of the BAM files? However, looking through all the BAM files made by zUMIs, I don't see a GE tag anywhere.
What is causing this bug?
Thanks
Sorry, correction: the .filtered.Aligned.GeneTagged.UBcorrected.sorted.bam file DOES have GE tags. So I'm not sure why this bug is happening. The GE tag should be retrieved and therefore the GE objects should exist when that strsplit(GE,",") function runs.
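For anyone wanting to repeat that check, here is a minimal sketch of how one might count records carrying a GE tag. It assumes the tag is stored as a string (so it shows up as `GE:Z:...` in SAM text) and that samtools is on the PATH; the helper name `count_ge_tags` is made up for illustration and is not part of zUMIs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read SAM text on stdin and print
# "<records with a GE tag> <total records>".
# Assumption: the tag is a string tag, i.e. it appears as "GE:Z:<value>".
count_ge_tags() {
  awk -F'\t' '
    /^@/ { next }                        # skip SAM header lines
    {
      total++
      for (i = 12; i <= NF; i++)         # optional tags start at column 12
        if ($i ~ /^GE:Z:/) { tagged++; break }
    }
    END { printf "%d %d\n", tagged + 0, total + 0 }
  '
}

# Usage on a real file (needs samtools; "$bam" is a placeholder path):
#   samtools view "$bam" | head -n 100000 | count_ge_tags
```

A result like `0 100000` would suggest the tag really is missing from that file, while a non-zero first number means the tags are present and the problem lies elsewhere.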
I think the same error came up recently in another issue. Which R version are you using?
It was solved by using the conda environment that ships with zUMIs. Please try that:
zUMIs.sh -c -y config.yaml
R 4.2.2
I saw in another issue on here that R 3.6 fixes this? Is that correct?
I tried the '-c' option, but that gives even more errors. Conda is not completely isolated from the local environment, so I think it still has issues.
Any other suggestions? I think maybe the only other option is docker.
But it would be great to understand why this 'GE' error is happening in the first place. That would fix the issue.
Sorry, one more idea: is it possible to run without '-c' and then rerun with '-c', so that it continues from the point where the initial run failed? That could be a solution. Would the rerun with '-c' detect the prior output files and skip to the last step that has not been completed? That way I could get to the 'GE' error step without '-c' and then rerun with '-c' to complete the pipeline.
It looks like the conda environment of zUMIs does not have R installed. So a Docker image that zUMIs is cloned into does not have R. I'd suggest putting R in the conda environment or making a fully containerized Docker image with all requirements.
Which R version should I install?
Thanks
Hi,
The conda environment has R preinstalled. If you think it doesn't in your copy, try to clone again from GitHub.
Here is also a full conda export for your reference.
channels:
- davemcg
- ccb-sb
- conda-forge
- bioconda
- defaults
dependencies:
- _libgcc_mutex=0.1=conda_forge
- _openmp_mutex=4.5=1_gnu
- _r-mutex=1.0.1=anacondar_1
- binutils_impl_linux-64=2.36.1=h193b22a_2
- binutils_linux-64=2.36=hf3e587d_0
- bioconductor-annotationdbi=1.48.0=r36_0
- bioconductor-biobase=2.46.0=r36h516909a_0
- bioconductor-biocfilecache=1.10.0=r36_0
- bioconductor-biocgenerics=0.32.0=r36_0
- bioconductor-biocparallel=1.20.0=r36he1b5a44_0
- bioconductor-biomart=2.42.0=r36_0
- bioconductor-biostrings=2.54.0=r36h516909a_0
- bioconductor-delayedarray=0.12.0=r36h516909a_0
- bioconductor-genomeinfodb=1.22.0=r36_0
- bioconductor-genomeinfodbdata=1.2.2=r36_0
- bioconductor-genomicalignments=1.22.0=r36h516909a_0
- bioconductor-genomicfeatures=1.38.0=r36_0
- bioconductor-genomicranges=1.38.0=r36h516909a_0
- bioconductor-iranges=2.20.0=r36h516909a_0
- bioconductor-plyranges=1.6.0=r36_0
- bioconductor-rhtslib=1.18.0=r36hdb70ac9_1
- bioconductor-rsamtools=2.2.0=r36he1b5a44_0
- bioconductor-rsubread=2.0.0=r36h516909a_0
- bioconductor-rtracklayer=1.46.0=r36h516909a_0
- bioconductor-s4vectors=0.24.0=r36h516909a_0
- bioconductor-summarizedexperiment=1.16.0=r36_0
- bioconductor-xvector=0.26.0=r36h516909a_0
- bioconductor-zlibbioc=1.32.0=r36h516909a_0
- bwidget=1.9.14=ha770c72_0
- bzip2=1.0.8=h7f98852_4
- c-ares=1.17.2=h7f98852_0
- ca-certificates=2021.5.30=ha878542_0
- cached-property=1.5.2=hd8ed1ab_1
- cached_property=1.5.2=pyha770c72_1
- cairo=1.16.0=h6cf1ce9_1008
- certifi=2021.5.30=py39hf3d152e_0
- click=8.0.1=py39hf3d152e_0
- coreutils=8.31=h516909a_0
- curl=7.78.0=hea6ffbf_0
- cycler=0.10.0=py_2
- expat=2.4.1=h9c3ff4c_0
- font-ttf-dejavu-sans-mono=2.37=hab24e00_0
- font-ttf-inconsolata=3.000=h77eed37_0
- font-ttf-source-code-pro=2.038=h77eed37_0
- font-ttf-ubuntu=0.83=hab24e00_0
- fontconfig=2.13.1=hba837de_1005
- fonts-conda-ecosystem=1=0
- fonts-conda-forge=1=0
- freetype=2.10.4=h0708190_1
- fribidi=1.0.10=h36c2ea0_0
- gcc_impl_linux-64=9.4.0=h03d3576_8
- gcc_linux-64=9.4.0=h391b98a_0
- gettext=0.19.8.1=h0b5b191_1005
- gfortran_impl_linux-64=9.4.0=h0003116_8
- gfortran_linux-64=9.4.0=hf0ab688_0
- git=2.33.0=pl5321hc30692c_0
- graphite2=1.3.13=h58526e2_1001
- gsl=2.6=he838d99_2
- gxx_impl_linux-64=9.4.0=h03d3576_8
- gxx_linux-64=9.4.0=h0316aca_0
- h5py=3.3.0=nompi_py39h98ba4bc_100
- harfbuzz=2.9.0=h83ec7ef_0
- hdf5=1.10.6=nompi_h6a2412b_1114
- htslib=1.13=h9093b5e_0
- icu=68.1=h58526e2_0
- jbig=2.1=h7f98852_2003
- joblib=1.0.1=pyhd8ed1ab_0
- jpeg=9d=h36c2ea0_0
- kernel-headers_linux-64=2.6.32=he073ed8_14
- kiwisolver=1.3.1=py39h1a9c180_1
- krb5=1.19.2=hcc1bbae_0
- lcms2=2.12=hddcbb42_0
- ld_impl_linux-64=2.36.1=hea4e1c9_2
- lerc=2.2.1=h9c3ff4c_0
- libblas=3.9.0=11_linux64_openblas
- libcblas=3.9.0=11_linux64_openblas
- libcurl=7.78.0=h2574ce0_0
- libdeflate=1.7=h7f98852_5
- libedit=3.1.20191231=he28a2e2_2
- libev=4.33=h516909a_1
- libffi=3.3=h58526e2_2
- libgcc-devel_linux-64=9.4.0=hd854feb_8
- libgcc-ng=11.1.0=hc902ee8_8
- libgfortran-ng=11.1.0=h69a702a_8
- libgfortran5=11.1.0=h6c583b3_8
- libgit2=1.1.1=hee63804_1
- libglib=2.68.4=h3e27bee_0
- libgomp=11.1.0=hc902ee8_8
- libiconv=1.16=h516909a_0
- liblapack=3.9.0=11_linux64_openblas
- libllvm10=10.0.1=he513fc3_3
- libnghttp2=1.43.0=h812cca2_0
- libopenblas=0.3.17=pthreads_h8fe5266_1
- libpng=1.6.37=h21135ba_2
- libsanitizer=9.4.0=h79bfe98_8
- libssh2=1.9.0=ha56f1ee_6
- libstdcxx-devel_linux-64=9.4.0=hd854feb_8
- libstdcxx-ng=11.1.0=h56837e0_8
- libtiff=4.3.0=hf544144_1
- libuuid=2.32.1=h7f98852_1000
- libwebp-base=1.2.1=h7f98852_0
- libxcb=1.13=h7f98852_1003
- libxml2=2.9.12=h72842e0_0
- llvmlite=0.36.0=py39h1bbdace_0
- loompy=3.0.6=py_0
- lz4-c=1.9.3=h9c3ff4c_1
- make=4.3=hd18ef5c_1
- matplotlib-base=3.4.3=py39h2fa2bec_0
- ncurses=6.2=h58526e2_4
- numba=0.53.1=py39h56b8d98_1
- numpy=1.21.2=py39hdbf815f_0
- numpy_groupies=0.9.13=pyh9f0ad1d_1
- olefile=0.46=pyh9f0ad1d_1
- openjpeg=2.4.0=hb52868f_1
- openssl=1.1.1k=h7f98852_1
- pandas=1.3.2=py39hde0f152_0
- pango=1.48.9=hb8ff022_0
- pcre=8.45=h9c3ff4c_0
- pcre2=10.37=h032f7d1_0
- perl=5.32.1=0_h7f98852_perl5
- pigz=2.6=h27826a3_0
- pillow=8.3.1=py39ha612740_0
- pip=21.2.4=pyhd8ed1ab_0
- pixman=0.40.0=h36c2ea0_0
- pthread-stubs=0.4=h36c2ea0_1001
- pyparsing=2.4.7=pyh9f0ad1d_0
- pysam=0.16.0.1=py39h051187c_3
- python=3.9.6=h49503c6_1_cpython
- python-dateutil=2.8.2=pyhd8ed1ab_0
- python_abi=3.9=2_cp39
- pytz=2021.1=pyhd8ed1ab_0
- r-askpass=1.1=r36hcfec24a_2
- r-assertthat=0.2.1=r36h6115d3f_2
- r-backports=1.2.1=r36hcfec24a_0
- r-base=3.6.3=hbcea092_8
- r-base64enc=0.1_3=r36hcfec24a_1004
- r-bh=1.75.0_0=r36hc72bb7e_0
- r-biocmanager=1.30.15=r36hc72bb7e_0
- r-bit=4.0.4=r36hcfec24a_0
- r-bit64=4.0.5=r36hcfec24a_0
- r-bitops=1.0_7=r36hcfec24a_0
- r-blob=1.2.1=r36h6115d3f_1
- r-brew=1.0_6=r36h6115d3f_1003
- r-brio=1.1.2=r36hcfec24a_0
- r-bslib=0.2.5.1=r36hc72bb7e_0
- r-cachem=1.0.5=r36hcfec24a_0
- r-cairo=1.5_12.2=r36hcdcec82_0
- r-callr=3.7.0=r36hc72bb7e_0
- r-cli=2.5.0=r36hc72bb7e_0
- r-clipr=0.7.1=r36h142f84f_0
- r-colorspace=2.0_1=r36hcfec24a_0
- r-commonmark=1.7=r36hcfec24a_1002
- r-covr=3.5.1=r36h03ef668_0
- r-cowplot=1.1.1=r36hc72bb7e_0
- r-crayon=1.4.1=r36hc72bb7e_0
- r-credentials=1.3.0=r36h6115d3f_0
- r-crosstalk=1.1.1=r36hc72bb7e_0
- r-curl=4.3.1=r36hcfec24a_0
- r-data.table=1.14.0=r36hcfec24a_0
- r-dbi=1.1.1=r36hc72bb7e_0
- r-dbplyr=2.1.1=r36hc72bb7e_0
- r-desc=1.3.0=r36hc72bb7e_0
- r-devtools=2.4.1=r36hc72bb7e_0
- r-diffobj=0.3.4=r36hcfec24a_0
- r-digest=0.6.27=r36h03ef668_0
- r-dplyr=1.0.4=r36h03ef668_0
- r-dt=0.18=r36hc72bb7e_0
- r-ellipsis=0.3.2=r36hcfec24a_0
- r-evaluate=0.14=r36h6115d3f_2
- r-extradistr=1.9.1=r36h0357c0b_0
- r-fansi=0.4.2=r36hcfec24a_0
- r-farver=2.1.0=r36h03ef668_0
- r-fastmap=1.1.0=r36h03ef668_0
- r-formatr=1.9=r36hc72bb7e_0
- r-fs=1.5.0=r36h0357c0b_0
- r-futile.logger=1.4.3=r36h6115d3f_1003
- r-futile.options=1.0.1=r36h6115d3f_1002
- r-generics=0.1.0=r36hc72bb7e_0
- r-gert=1.3.0=r36hbd84cd2_0
- r-ggplot2=3.3.3=r36hc72bb7e_0
- r-ggrastr=0.1.7=r36_1000
- r-gh=1.3.0=r36hc72bb7e_0
- r-git2r=0.28.0=r36hf628c3e_0
- r-gitcreds=0.1.1=r36hc72bb7e_0
- r-glue=1.4.2=r36hcfec24a_0
- r-gtable=0.3.0=r36h6115d3f_3
- r-hdf5r=1.3.3=r36h4dd06ac_0
- r-highr=0.9=r36hc72bb7e_0
- r-hms=1.1.0=r36hc72bb7e_0
- r-htmltools=0.5.1.1=r36h03ef668_0
- r-htmlwidgets=1.5.3=r36hc72bb7e_0
- r-httpuv=1.6.1=r36h03ef668_0
- r-httr=1.4.2=r36h6115d3f_0
- r-inflection=1.3.5=r36h6115d3f_0
- r-ini=0.3.1=r36h6115d3f_1003
- r-isoband=0.2.4=r36h03ef668_0
- r-iterators=1.0.13=r36h142f84f_0
- r-itertools=0.1_3=r36_1003
- r-jquerylib=0.1.4=r36hc72bb7e_0
- r-jsonlite=1.7.2=r36hcfec24a_0
- r-knitr=1.33=r36hc72bb7e_0
- r-labeling=0.4.2=r36h142f84f_0
- r-lambda.r=1.2.4=r36h6115d3f_1
- r-later=1.2.0=r36h03ef668_0
- r-lattice=0.20_44=r36hcfec24a_0
- r-lazyeval=0.2.2=r36hcfec24a_2
- r-lifecycle=1.0.0=r36hc72bb7e_0
- r-loomr=0.2.0=r36_0
- r-magrittr=2.0.1=r36hcfec24a_1
- r-markdown=1.1=r36hcfec24a_1
- r-mass=7.3_54=r36hcfec24a_0
- r-matrix=1.3_3=r36he454529_0
- r-matrixstats=0.58.0=r36hcfec24a_0
- r-mclust=5.4.7=r36h52d45c5_0
- r-memoise=2.0.0=r36hc72bb7e_0
- r-mgcv=1.8_35=r36he454529_0
- r-mime=0.10=r36hcfec24a_0
- r-munsell=0.5.0=r36h6115d3f_1003
- r-nlme=3.1_152=r36h859d828_0
- r-openssl=1.4.4=r36he36bf35_0
- r-pillar=1.6.1=r36hc72bb7e_0
- r-pkgbuild=1.2.0=r36hc72bb7e_0
- r-pkgconfig=2.0.3=r36h6115d3f_1
- r-pkgload=1.2.1=r36h03ef668_0
- r-plogr=0.2.0=r36h6115d3f_1003
- r-praise=1.0.0=r36h6115d3f_1004
- r-prettyunits=1.1.1=r36h6115d3f_1
- r-processx=3.5.2=r36hcfec24a_0
- r-progress=1.2.2=r36h6115d3f_2
- r-promises=1.2.0.1=r36h03ef668_0
- r-ps=1.6.0=r36hcfec24a_0
- r-purrr=0.3.4=r36hcfec24a_1
- r-r6=2.5.0=r36hc72bb7e_0
- r-rappdirs=0.3.3=r36hcfec24a_0
- r-rcmdcheck=1.3.3=r36h6115d3f_3
- r-rcolorbrewer=1.1_2=r36h6115d3f_1003
- r-rcpp=1.0.6=r36h03ef668_0
- r-rcurl=1.98_1.3=r36hcfec24a_0
- r-rematch2=2.1.2=r36h6115d3f_1
- r-remotes=2.3.0=r36hc72bb7e_0
- r-rex=1.2.0=r36h6115d3f_1
- r-rlang=0.4.11=r36hcfec24a_0
- r-rlist=0.4.6.1=r36h6115d3f_1003
- r-roxygen2=7.1.1=r36h0357c0b_0
- r-rprojroot=2.0.2=r36hc72bb7e_0
- r-rsqlite=2.2.5=r36h03ef668_0
- r-rstudioapi=0.13=r36hc72bb7e_0
- r-rversions=2.0.2=r36h6115d3f_0
- r-sass=0.4.0=r36h03ef668_0
- r-scales=1.1.1=r36h6115d3f_0
- r-sessioninfo=1.1.1=r36h6115d3f_1002
- r-shiny=1.6.0=r36hc72bb7e_0
- r-shinybs=0.61=r36h6115d3f_1003
- r-shinythemes=1.2.0=r36hc72bb7e_0
- r-snow=0.4_3=r36h6115d3f_1002
- r-sourcetools=0.1.7=r36he1b5a44_1002
- r-stringdist=0.9.6.3=r36hcfec24a_0
- r-stringi=1.6.2=r36hcabe038_0
- r-stringr=1.4.0=r36h6115d3f_2
- r-sys=3.4=r36hcfec24a_0
- r-testthat=3.0.2=r36h03ef668_0
- r-tibble=3.1.2=r36hcfec24a_0
- r-tidyselect=1.1.1=r36hc72bb7e_0
- r-usethis=2.0.1=r36hc72bb7e_0
- r-utf8=1.2.1=r36hcfec24a_0
- r-vctrs=0.3.8=r36hcfec24a_1
- r-viridislite=0.4.0=r36hc72bb7e_0
- r-waldo=0.2.5=r36hc72bb7e_0
- r-whisker=0.4=r36h6115d3f_1
- r-withr=2.4.2=r36hc72bb7e_0
- r-xfun=0.23=r36hcfec24a_0
- r-xml=3.99_0.3=r36hcdcec82_1
- r-xml2=1.3.2=r36h0357c0b_1
- r-xopen=1.0.0=r36h6115d3f_1003
- r-xtable=1.8_4=r36h6115d3f_3
- r-yaml=2.2.1=r36hcfec24a_1
- r-zip=2.1.1=r36hcfec24a_0
- readline=8.1=h46c0cb4_0
- samtools=1.13=h8c37831_0
- scikit-learn=0.24.2=py39h4dfa638_1
- scipy=1.7.1=py39hee8e79c_0
- sed=4.8=he412f7d_0
- setuptools=57.4.0=py39hf3d152e_0
- six=1.16.0=pyh6c4a22f_0
- sqlite=3.36.0=h9cd32fc_0
- star=2.7.3a=0
- sysroot_linux-64=2.12=he073ed8_14
- threadpoolctl=2.2.0=pyh8a188c0_0
- tk=8.6.11=h21135ba_0
- tktable=2.10=hb7b940f_3
- tornado=6.1=py39h3811e60_1
- tzdata=2021a=he74cb21_1
- velocyto.py=0.17.17=py39hcbe4a3b_4
- wheel=0.37.0=pyhd8ed1ab_1
- xorg-kbproto=1.0.7=h7f98852_1002
- xorg-libice=1.0.10=h7f98852_0
- xorg-libsm=1.2.3=hd9c2040_1000
- xorg-libx11=1.7.2=h7f98852_0
- xorg-libxau=1.0.9=h7f98852_0
- xorg-libxdmcp=1.1.3=h7f98852_0
- xorg-libxext=1.3.4=h7f98852_1
- xorg-libxrender=0.9.10=h7f98852_1003
- xorg-libxt=1.2.1=h7f98852_2
- xorg-renderproto=0.11.1=h7f98852_1002
- xorg-xextproto=7.3.0=h7f98852_1002
- xorg-xproto=7.0.31=h7f98852_1007
- xz=5.2.5=h516909a_1
- zlib=1.2.11=h516909a_1010
- zstd=1.5.0=ha95c52a_0
I have previously confirmed that it runs in base-system Docker and on AWS cloud instances.
Best, Christoph
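Since the export above pins r-base=3.6.3, it can be worth checking which R the shell actually resolves before a run. The following is only a sketch: the helper name `r_version_matches` is hypothetical, and treating 3.6 as the expected series is an assumption taken from that export, not a documented zUMIs requirement.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if an R version string equals the expected
# prefix or starts with "<prefix>." (so "3.6" matches "3.6.3" but not "3.60.1").
r_version_matches() {
  local expected="$1" actual="$2"
  case "$actual" in
    "$expected"|"$expected".*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (needs Rscript on PATH):
#   command -v Rscript     # shows which R installation the shell resolves
#   actual=$(Rscript -e 'cat(as.character(getRversion()))')
#   r_version_matches 3.6 "$actual" || echo "unexpected R version: $actual" >&2
```

If `command -v Rscript` points outside the conda environment, the pipeline is probably picking up the system R instead of the bundled one.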
Thanks. I think the issue was #332. I removed zUMIs-env and now it is working. Now I'll run the pipeline again from Docker and hopefully it works.
Hi @gevro, I ran into the same problem: "function 'strsplit': object 'GE' not found". I used my own zUMIs environment with R version 4.3.1. Could you please tell me whether you have solved this problem?
I wasn't able to figure this out, sorry.
I see. But I'm wondering, did you still use it? I mean, how did you deal with this problem in the end?
see also #375
Hi, We're getting the below error, near the end of the pipeline. Any idea what the issue is? Thanks