vanheeringen-lab / seq2science

Automated and customizable preprocessing of Next-Generation Sequencing data, including full (sc)ATAC-seq, ChIP-seq, and (sc)RNA-seq workflows. Works equally easily with public and local data.
https://vanheeringen-lab.github.io/seq2science
MIT License

Q: Stuck with initialisation of a workflow (atac-seq) #941

Closed WouterVGKULEUVEN closed 1 year ago

WouterVGKULEUVEN commented 1 year ago

Question: We are trying to analyze ATAC-seq data from 36 samples. As a trial, I wanted to test whether this workflow would work on, for example, the first four samples, of which the first three are biological replicates. I can start the workflow, but I simply don't understand the error I get...

What have I tried: Here you can find the config and samples files (I had to convert them to .txt to be able to upload them here). I also took a screenshot of the folder structure. Thanks in advance!!!

Screenshot 2022-12-16 at 15 45 13

seq2science.2022-12-16T153021.745497.log

samples.txt (https://github.com/vanheeringen-lab/seq2science/files/10246660/samples.txt)

config.txt

Maarten-vd-Sande commented 1 year ago

Thanks for the issue, I'll take a look. I'll probably have an answer for you after the weekend.

Maarten-vd-Sande commented 1 year ago

Could you post what's in this file: /Users/u0107886/Stefanie/results/log/trackhub/ASM14944v2.index.log

On what kind of computer are you running it? Linux, Mac, or Windows?

WouterVGKULEUVEN commented 1 year ago

ASM14944v2.index.log

Thanks for your very quick response!! We are running it on a Mac with 4 cores and 32 GB of memory.

Maarten-vd-Sande commented 1 year ago

I can't reproduce the error, but I think it's because macOS ships a different version of grep than Linux. Could you add grep to the environment that's needed for the trackhub index? That's this file:

/Users/u0107886/miniconda/envs/seq2science/lib/python3.8/site-packages/seq2science/rules/../envs/ucsc.yaml

Then add grep so that it looks like this:

name: ucsc
channels:
  - conda-forge
  - bioconda
  - defaults
dependencies:
  - bioconda::grep=3.4
  - bioconda::ucsc-bedsort=377
  - bioconda::ucsc-bedGraphToBigWig=377
  - bioconda::ucsc-bedToBigBed=377
  - bioconda::ucsc-fatotwobit=377
  - bioconda::ucsc-gtftogenepred=377
  - bioconda::ucsc-genepredtobed=377
  - bioconda::ucsc-hggcpercent=377
  - bioconda::ucsc-ixixx=377
  - bioconda::ucsc-wigtobigwig=377
  - conda-forge::conda-ecosystem-user-package-isolation=1.0

And then run seq2science as you normally would. It should re-install the environment, with GNU grep added, and hopefully this works then :smile:!
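To check which grep flavour is first on your PATH, something like the sketch below should do. It is only an illustration: classify_grep is a made-up helper name, not part of seq2science, and it looks at nothing more than the first line of the grep --version banner.

```shell
# classify_grep is a hypothetical helper, not part of seq2science.
# It labels a `grep --version` banner line as gnu, bsd, or unknown.
classify_grep() {
  case "$1" in
    # Check BSD first: some macOS banners read "BSD grep, GNU compatible".
    *BSD*) echo "bsd" ;;
    *GNU*) echo "gnu" ;;
    *)     echo "unknown" ;;
  esac
}

# Classify whichever grep is found first on PATH.
classify_grep "$(grep --version 2>/dev/null | head -n 1)"
```

On a stock macOS install this would be expected to print bsd, and on most Linux systems gnu.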

Let me know if that solves the issue for you. As a last resort, you can always set create_trackhub: false so that no UCSC trackhub is made, which hopefully lets the rest of the pipeline run successfully.

WouterVGKULEUVEN commented 1 year ago

Hey! Thanks for that! I had to set the create_trackhub to false... The workflow ran, but got stuck near the end... I attached the error log. seq2science.2022-12-22T101835.775224.log

Maarten-vd-Sande commented 1 year ago

It’s the same as https://github.com/vanheeringen-lab/seq2science/issues/939#issuecomment-1341063395 I think. It just so happened we got two Mac users in the span of two weeks!

You either have to update your bedtools environment the way I describe in the link, or turn off the deeptools plots for QC: deeptools_qc: false
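To see how the two switches mentioned in this thread are currently set, a quick check along these lines may help. It is a sketch under assumptions: show_switches is a hypothetical helper, and config.yaml in the current directory is only a guess at where the config lives, so set CONFIG to your own file.

```shell
# show_switches is a hypothetical helper, not a seq2science command.
# It prints the top-level create_trackhub and deeptools_qc lines of a config.
show_switches() {
  grep -E '^(create_trackhub|deeptools_qc):' "$1"
}

# CONFIG defaults to config.yaml in the current directory (an assumption).
CONFIG="${CONFIG:-config.yaml}"
if [ -f "$CONFIG" ]; then
  show_switches "$CONFIG" || echo "neither option is set in $CONFIG"
else
  echo "no config found at $CONFIG" >&2
fi
```

A key that is not printed is simply absent from the file, in which case seq2science falls back to whatever its default for that option is.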

I just started my Xmas break, and I will release a new version of seq2science when I’m back with the fixes for both of your issues, so you (and others) don’t have to play with the envs seq2science installs.

WouterVGKULEUVEN commented 1 year ago

Hey!

Thanks already for all your help! To recap: create_trackhub is still set to false, and deeptools_qc is also set to false. It then runs until the error attached below. I looked for this error in previous issues and found it here: https://github.com/vanheeringen-lab/seq2science/issues/625

However, I don't see what was changed to overcome this issue.

Kind regards, Wouter seq2science.2023-01-03T143400.407190.log

Maarten-vd-Sande commented 1 year ago

Can you post the log of the actual rule that fails?

/Users/u0107886/Stefanie/results/log/combine_peaks/ASM14944v2-macs2.log
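If it is unclear which rule log belongs to the failure, listing the most recently modified logs is one way to narrow it down. The snippet below is a rough sketch: newest_logs is a hypothetical helper, and results/log is assumed from the paths quoted in this thread.

```shell
# newest_logs is a hypothetical helper, not part of seq2science.
# Usage: newest_logs <log_dir> [count]
# Prints the most recently modified .log files, newest first.
newest_logs() {
  find "$1" -type f -name '*.log' -exec ls -t {} + | head -n "${2:-5}"
}

# LOG_DIR is an assumption based on the results/log paths in this thread.
LOG_DIR="${LOG_DIR:-results/log}"
if [ -d "$LOG_DIR" ]; then
  newest_logs "$LOG_DIR"
fi
```

The log of the failing rule is usually among the first few entries, since it was written just before the run stopped.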

WouterVGKULEUVEN commented 1 year ago

Next time I'll immediately include that!

ASM14944v2-macs2.log

Maarten-vd-Sande commented 1 year ago

That one I actually solved yesterday (I think). At the end of today there will be a new seq2science release. Hopefully everything works for you with that version

Maarten-vd-Sande commented 1 year ago

You can try mamba install seq2science=0.9.7 tomorrow in your seq2science environment, and seq2science should be updated to the newest version. Please let me know if it works then :pray:

Maarten-vd-Sande commented 1 year ago

The issue you had was with conda and the newest numpy version. I've changed the specification of which numpy version should be used on the bioconda side, so removing the environment and restarting the run with the current seq2science version probably also works, by the way.

rm -r ../miniconda/envs/seq2science/lib/python3.8/site-packages/seq2science/.snakemake/cd6528626806a6e527f5cf7769376ea3_
seq2science run ...

Maarten-vd-Sande commented 1 year ago

I'm assuming this has been resolved; feel free to reopen if not.