Open amizeranschi opened 2 years ago
Thanks for the reply @mjsteinbaugh
I'm attaching the file you requested: tx2gene.csv
I'm also attaching a file with the commands I used to set up bcbio, to download the data and to set up the bcbio runs.
The relevant lines for this analysis are 115-166 (downloading the data) and 206-235 (setting up and running the analysis).
Hope this helps.
OK great thanks, I'll work on a fix for this over the weekend and will be in touch soon with an update.
OK this tx2gene issue with the sacCer3 genome should be fixed by the pending update to r-acidgenomes 0.2.20. I'm working on pushing this to bioconda today.
See relevant code change here: https://github.com/acidgenomics/r-acidgenomes/blob/main/R/AllClasses.R#L864
You can check this with your install here:
packageVersion("AcidGenomes")
## 0.2.20
library(AcidGenomes)
tx2gene <- importTx2Gene(
    file = pasteURL(
        "github.com",
        "bcbio",
        "bcbio-nextgen",
        "files",
        "7739401",
        "tx2gene.csv",
        protocol = "https"
    )
)
print(tx2gene)
## Tx2Gene with 7036 rows and 2 columns
## txId geneId
## <character> <character>
## 1 ETS1-1_rRNA ETS1-1
## 2 ETS1-2_rRNA ETS1-2
## 3 ETS2-1_rRNA ETS2-1
## 4 ETS2-2_rRNA ETS2-2
## 5 HRA1_ncRNA HRA1
## ... ... ...
## 7032 YPR202W_mRNA YPR202W
## 7033 YPR203W_mRNA YPR203W
## 7034 YPR204C-A_mRNA YPR204C-A
## 7035 YPR204W_mRNA YPR204W
## 7036 ZOD1_ncRNA ZOD1
thanks @mjsteinbaugh! I've pinned it in cloudbiolinux: https://github.com/chapmanb/cloudbiolinux/blob/master/contrib/flavor/ngs_pipeline_minimal/packages-conda.yaml#L280
@amizeranschi please let us know if it works for you.
Hello,
Thanks for looking into this. I upgraded bcbio and tools to the latest development version and launched R from the directory ${bcbio_dir}/anaconda/envs/rbcbiornaseq/bin, and the commands mentioned by @mjsteinbaugh above ran successfully. AcidGenomes v. 0.2.20 seems to be available.
However, the bcbio analysis still ended up crashing, this time due to the version of a different package:
[2022-01-09T13:21Z] multiprocessing: run_bcbiornaseqload
[2022-01-09T13:21Z] Loading bcbioRNASeq object.
[2022-01-09T13:21Z] Loading required package: basejump
[2022-01-09T13:21Z] Error: package or namespace load failed for ‘basejump’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
[2022-01-09T13:21Z] namespace ‘AcidSingleCell’ 0.1.8 is being loaded, but >= 0.1.9 is required
[2022-01-09T13:21Z] Error: package ‘basejump’ could not be loaded
[2022-01-09T13:21Z] Execution halted
[2022-01-09T13:21Z] Uncaught exception occurred
Traceback (most recent call last):
File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 26, in run
_do_run(cmd, checks, log_stdout, env=env)
File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 106, in _do_run
raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/home/user/bcbio-nextgen/anaconda/envs/rbcbiornaseq/bin/Rscript --vanilla /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/bcbioRNASeq/load_bcbioRNAseq.R
Loading required package: basejump
Error: package or namespace load failed for ‘basejump’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
namespace ‘AcidSingleCell’ 0.1.8 is being loaded, but >= 0.1.9 is required
Error: package ‘basejump’ could not be loaded
Execution halted
' returned non-zero exit status 1.
Traceback (most recent call last):
File "/home/user/bcbio-nextgen/anaconda/bin/bcbio_nextgen.py", line 245, in <module>
main(**kwargs)
File "/home/user/bcbio-nextgen/anaconda/bin/bcbio_nextgen.py", line 46, in main
run_main(**kwargs)
File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/pipeline/main.py", line 50, in run_main
fc_dir, run_info_yaml)
File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/pipeline/main.py", line 91, in _run_toplevel
for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/pipeline/main.py", line 290, in rnaseqpipeline
run_parallel("run_bcbiornaseqload", [sample])
File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
return run_multicore(fn, items, config, parallel=parallel)
File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
for data in joblib.Parallel(parallel["num_jobs"], batch_size=1, backend="multiprocessing")(joblib.delayed(fn)(*x) for x in items):
File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 1048, in __call__
if self.dispatch_one_batch(iterator):
File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 866, in dispatch_one_batch
self._dispatch(tasks)
File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 784, in _dispatch
job = self._backend.apply_async(batch, callback=cb)
File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
result = ImmediateResult(func)
File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 572, in __init__
self.results = batch()
File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 263, in __call__
for func, args, kwargs in self.items]
File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/joblib/parallel.py", line 263, in <listcomp>
for func, args, kwargs in self.items]
File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/utils.py", line 59, in wrapper
return f(*args, **kwargs)
File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/distributed/multitasks.py", line 92, in run_bcbiornaseqload
return bcbiornaseq.make_bcbiornaseq_object(*args)
File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/rnaseq/bcbiornaseq.py", line 31, in make_bcbiornaseq_object
do.run([rcmd, "--vanilla", r_file], "Loading bcbioRNASeq object.")
File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 26, in run
_do_run(cmd, checks, log_stdout, env=env)
File "/home/user/bcbio-nextgen/anaconda/lib/python3.7/site-packages/bcbio/provenance/do.py", line 106, in _do_run
raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command '/home/user/bcbio-nextgen/anaconda/envs/rbcbiornaseq/bin/Rscript --vanilla /home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/bcbioRNASeq/load_bcbioRNAseq.R
Loading required package: basejump
Error: package or namespace load failed for ‘basejump’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
namespace ‘AcidSingleCell’ 0.1.8 is being loaded, but >= 0.1.9 is required
Error: package ‘basejump’ could not be loaded
Execution halted
' returned non-zero exit status 1.
You're seeing this error because the conda environment solver isn't working correctly. We should be installing these versions:
r-acidgenomes 0.2.20 r41hdfd78af_0 bioconda
r-acidexperiment 0.2.2 r41hdfd78af_0 bioconda
r-acidsinglecell 0.1.9 r41hdfd78af_0 bioconda
r-basejump 0.14.23 r41hdfd78af_0 bioconda
r-bcbiobase 0.6.22 r41hdfd78af_0 bioconda
r-bcbiornaseq 0.3.44 r41hdfd78af_0 bioconda
@naumenko-sa
Could you pin these package versions as well in cloudbiolinux?
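For anyone who wants to check their own environment against the six versions listed above, here is a minimal sketch. It assumes you pass it the plain-text output of `conda list` for the `rbcbiornaseq` environment; the function name `find_mismatches` and the `EXPECTED` dictionary are made up for this illustration and are not part of bcbio or conda.

```python
# Pins copied from the version list above (hypothetical helper, not part of bcbio).
EXPECTED = {
    "r-acidgenomes": "0.2.20",
    "r-acidexperiment": "0.2.2",
    "r-acidsinglecell": "0.1.9",
    "r-basejump": "0.14.23",
    "r-bcbiobase": "0.6.22",
    "r-bcbiornaseq": "0.3.44",
}


def find_mismatches(conda_list_text, expected=EXPECTED):
    """Return {package: (installed_or_None, expected)} for every pin not met."""
    installed = {}
    for line in conda_list_text.splitlines():
        fields = line.split()
        # Skip blank lines and the '#' comment header that `conda list` prints.
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        installed[fields[0]] = fields[1]
    return {
        name: (installed.get(name), wanted)
        for name, wanted in expected.items()
        if installed.get(name) != wanted
    }
```

One way to use it is to save the output of `conda list -n rbcbiornaseq` to a file and pass the file contents to `find_mismatches`. An empty result means all six packages are at the listed versions; a `None` in the first position means the package was not found at all.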
I've added https://github.com/chapmanb/cloudbiolinux/blob/master/contrib/flavor/ngs_pipeline_minimal/packages-conda.yaml#L281, please try again!
Thanks, but that doesn't seem to be enough to get everything installed as it should. You might have to pin all six package versions mentioned above.
Error: package or namespace load failed for ‘bcbioRNASeq’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
namespace ‘bcbioBase’ 0.6.21 is being loaded, but >= 0.6.22 is required
Execution halted
' returned non-zero exit status 1.
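Since these failures surface one package at a time, it can help to pull every version complaint out of a log in one pass. The sketch below is only an illustration: it assumes the R message keeps the exact wording seen in the errors above ("namespace ‘X’ A is being loaded, but >= B is required"), and the names `NAMESPACE_ERROR` and `outdated_namespaces` are invented here.

```python
import re

# Matches the R message seen in the logs above; accepts curly or straight quotes.
NAMESPACE_ERROR = re.compile(
    r"namespace [‘'](?P<package>[^’']+)[’'] (?P<loaded>\S+) "
    r"is being loaded, but (?P<op>[<>=]{1,2}) (?P<required>\S+) is required"
)


def outdated_namespaces(log_text):
    """Return sorted, de-duplicated (package, loaded, op, required) tuples."""
    found = {
        match.group("package", "loaded", "op", "required")
        for match in NAMESPACE_ERROR.finditer(log_text)
    }
    return sorted(found)
```

Because bcbio repeats the R error inside the Python traceback, the same complaint appears several times in a single log; collecting the matches into a set keeps each one only once.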
please try again, I hope we got all of them now
Thanks a lot. We're definitely making progress.
This time, bcbiornaseq complains about the ref-transcripts.gtf file for sacCer3:
🧪 ### featureCounts
→ Importing aligned counts from featureCounts.
→ Importing '/home/user/bcbio-runs/rna-seq/rna-seq-analysis/final/2022-01-11_rna-seq-analysis/featureCounts/combined.counts' using data.table::`fread()`.
🧪 ## Feature metadata
bcbio GTF file:
/home/user/bcbio-nextgen/genomes/Scerevisiae/sacCer3/rnaseq/ref-transcripts.gtf
→ Making <GRanges> from GFF file ('ref-transcripts.gtf').
→ Getting GFF metadata for 'ref-transcripts.gtf'.
Error: Failed to detect provider (e.g. "Ensembl") from 'ref-transcripts.gtf'.
Backtrace:
█
1. └─bcbioRNASeq::bcbioRNASeq(...)
2. └─AcidGenomes::makeGRangesFromGFF(...)
3. └─AcidGenomes:::.makeGRangesFromRtracklayer(...)
4. └─AcidGenomes::getGFFMetadata(file)
5. └─AcidCLI::abort(...)
6. └─cli::cli_abort(x)
Execution halted
' returned non-zero exit status 1.
I have checked now and that GTF file doesn't have any header. It was installed as part of the sacCer3 genome by bcbio.
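For reference, a quick way to check a GTF for header lines is to read its leading `#` comment lines, which is where files from providers such as Ensembl usually carry their metadata. This is a minimal sketch with a made-up function name (`gtf_header_lines`); whether AcidGenomes relies on those lines to detect the provider is an assumption on my part, not something confirmed in this thread.

```python
def gtf_header_lines(path):
    """Return the leading '#' comment lines of a GTF/GFF file (empty list if none)."""
    header = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            # The header ends at the first line that is not a comment.
            if not line.startswith("#"):
                break
            header.append(line.rstrip("\n"))
    return header
```

Calling it on the ref-transcripts.gtf path shown in the log and getting an empty list would match the observation that the file has no header.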
Thanks @amizeranschi can you post that GTF file so I can take a look and work on a fix?
Sure thing, here you go. I changed the extension to .txt so that GitHub would accept it.
OK this appears to be fixed in the development version of r-acidgenomes, which is not yet suitable for deployment on bioconda. I'll post an update when I finish rolling out a stable release supporting this fix.
@mjsteinbaugh
Would you consider adding support in bcbiornaseq for differential affinity in ChIP-seq and ATAC-seq peaks? Given that bcbio produces consensus peaks and computes read counts, these could be used in DESeq2 exactly like in the RNA-seq scenario.
https://bcbio-nextgen.readthedocs.io/en/latest/contents/atac.html#differential-affinity-analysis
@amizeranschi OK I think this should be fixed on bioconda.
Hello, I'm getting a similar error trying to install trinity with conda: ERROR conda.core.link:_execute(730): An error occurred while installing package 'bioconda::bioconductor-go.db-3.14.0-r41hdfd78af_0'. I've tried many conda versions but the error persists. What can I do to fix it?
Hi @Melisa-Magallanes thanks for the update -- I'll try clean installing bcbio and see if I can reproduce
@Melisa-Magallanes Just in case your error is similar to what I've been seeing (post-link script failed for package bioconda::bioconductor-go.db-3.14.0-r41hdfd78af_0), then know that this is a relatively common problem now and it's being addressed.
Have a look here: https://github.com/bioconda/bioconda-recipes/issues/36499#issuecomment-1217214789
Thanks for the update! I'll see if we can come up with a fix in bioconda-recipes.
Great, thanks a lot. Please have a look at bioconductor-org.hs.eg.db as well. I've been getting a similar error with it while attempting to install bcbio.
Edit: I've submitted a couple of pull requests. https://github.com/bioconda/bioconda-recipes/pull/36554 https://github.com/bioconda/bioconda-recipes/pull/36555
@amizeranschi r-bcbiornaseq has been updated to 0.5.1 on bioconda. I'm working on updating this in the main bcbio-nextgen install with @naumenko-sa
Thanks @mjsteinbaugh !
Hello!
I'm trying to run a bulk RNA-seq analysis using the following template:
However, this ends with the following error:
This is strange to see, because the package does seem to be installed in the rbcbiornaseq environment: