MarWoes / wg-blimp

wg-blimp: an end-to-end analysis pipeline for whole genome bisulfite sequencing data
GNU Affero General Public License v3.0

Pipeline fails due to Picard installation #18

Closed MarWoes closed 2 years ago

MarWoes commented 2 years ago

Sometimes wg-blimp fails at execution because the Picard build seems to pull in some incompatible dependencies. This may be resolved by using picard-slim instead of picard in the snakemake environment.

JakeLehle commented 2 years ago

Hey, just a follow-up on this. When we set up the wg-blimp environment with

$ conda env create -f environment.yaml

should we alter the environment.yaml file to install picard-slim instead of picard? Or should we install the anaconda environment with picard as normal and then alter /wg-blimp/snakemake_wrapper/envs/picard.yaml so that the dependency in that file is now picard-slim? Or should we change both?

I moved all the code over to our school's cluster and I'm getting stuck at the picard step when running the sample data. I get these error messages, which I think are the ones you were describing in this thread.

[Two screenshots of the error messages]

MarWoes commented 2 years ago

Hi Jake, Thanks for the reply! Unfortunately I have not been able to pinpoint what exactly is breaking where, but for me switching to picard-slim did not work as expected. I'll have a further look at it and hope to have a fix ready within the next week. Best, Marius

JakeLehle commented 2 years ago

Hey Marius,

Yeah I saw running everything through anaconda isn't working right now but I'm sure it's just a small issue. I'll take a look at the picard documentation as well. If I catch anything I'll let you know so you can update the Snakemake file and fix the issue.

On the bright side, if anyone wants a temporary workaround for this issue right now, the docker image at imimarw/wg-blimp is working! You can run everything on a personal computer with

$ docker pull imimarw/wg-blimp

I was able to pull that container into our school's cluster with Singularity, and I'm running my full samples right now with no issues so far. Fingers crossed. The Snakefile on that image is a little older and requires an additional TSS file, which I downloaded from this link:

https://genome.ucsc.edu/cgi-bin/hgTables?hgsid=1219084465_HRDjHL1J6LoO4NKjTqqdzM3L9RS0&clade=mammal&org=Mouse&db=mm39&hgta_group=genes&hgta_track=wgEncodeGencodeVM27&hgta_table=0&hgta_regionType=genome&position=chr12%3A56%2C741%2C761-56%2C761%2C390&hgta_outputType=primaryTable&hgta_outFileName=tss-GRCm38.csv

If you have the time @MarWoes let me know if that file is the right one. I'm pretty sure but I won't know till the samples finish running.
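For anyone following the Singularity route mentioned above, here is a minimal sketch of how the pull command can be put together. The output file name `wg-blimp.sif` and the helper `docker_to_sif_cmd` are assumptions for illustration and do not come from this thread; the helper only prints the command so it can be checked before running it on a cluster.

```shell
# Print the Singularity command that converts a Docker Hub image to a .sif file.
# $1: Docker Hub image name, $2: output file name (hypothetical, pick your own).
docker_to_sif_cmd() {
    printf 'singularity pull %s docker://%s\n' "$2" "$1"
}

# Image name taken from the comment above; the .sif name is an assumption.
docker_to_sif_cmd imimarw/wg-blimp wg-blimp.sif
```

Running the printed command requires Singularity to be installed and network access to Docker Hub.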

Also, the Snakefile on the container defaults the paths for the annotation files (the cgi, repeat masker, and tss files) to locations way out in the /opt/... directory, which I can't write into when I'm running everything in Singularity. A workaround is to make a config file in which the paths to those files all start with ../../../../../../path/to/files. The six "../" step backwards all the way out of the /opt/ dir, and the rest of the path then moves forward to the directory that holds your cgi, repeat masker, and tss files.
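The "../" trick can be sketched as a small helper that emits one "../" per directory level, so the prefix always climbs to the filesystem root. The base directory `/opt/a/b/c/d/e` and the target `data/annotation/tss.csv` below are hypothetical placeholders; the real default directory inside the container is not spelled out in this thread.

```shell
# Emit one "../" for every component of an absolute directory path, so that
# the resulting prefix climbs from that directory up to "/".
# The function body runs in a subshell, so its variables stay local.
climb_prefix() (
    case "$1" in
        /*) ;;                                   # absolute path: fine
        *)  echo "need an absolute path" >&2; return 1 ;;
    esac
    dir="${1%/}"                                 # drop a trailing slash, if any
    prefix=""
    while [ -n "$dir" ]; do
        prefix="../$prefix"
        dir="${dir%/*}"                          # strip the last path component
    done
    printf '%s\n' "$prefix"
)

# HYPOTHETICAL base directory, six levels deep, standing in for /opt/...
base_dir="/opt/a/b/c/d/e"
# HYPOTHETICAL location of an annotation file, relative to "/".
target="data/annotation/tss.csv"

echo "$(climb_prefix "$base_dir")$target"
```

With a six-level base directory the helper yields six "../", which matches the count used in the workaround; a different container layout would need a different count.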

If anyone needs a template, I can upload a screenshot of what my config file looks like.

Best, Jake

MarWoes commented 2 years ago

Hi Jake,

Thanks for your detailed answer! From what I can tell your TSS file should work fine, but as you already pointed out the Docker version is quite old and differs in syntax from the current version. However, the issue with Picard failing was recently fixed, see https://github.com/broadinstitute/picard/issues/1747

I expect the issue to resolve once all the changes are propagated to Bioconda. I'll have a look if I can exclude the broken Picard version from the Snakemake environment.

Best, Marius

MarWoes commented 2 years ago

The broken Picard installation (2.26.5) is now simply excluded from the snakemake environment. This will also allow future Picard installations to be used. Fixed in 2621fd9de4f3a23fff1d02987d4869dd2e86eeaf .
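To check whether an existing environment still contains the excluded build, a sketch along these lines may help. The helper name `picard_version_ok` is an assumption for illustration; only the version number 2.26.5 comes from this thread. In conda, this kind of exclusion is generally expressed as a version constraint such as `picard !=2.26.5`, though the exact line used in the commit is not reproduced here.

```shell
# Picard build reported as broken in this thread.
BROKEN_PICARD_VERSION="2.26.5"

# Succeeds (exit status 0) unless the given version is the known-broken one.
picard_version_ok() {
    [ "$1" != "$BROKEN_PICARD_VERSION" ]
}

# Usage sketch (assumes conda is on PATH and the environment is active):
#   installed=$(conda list picard | awk '$1 == "picard" {print $2}')
#   picard_version_ok "$installed" || echo "broken Picard build installed" >&2
```

If the check fails, rebuilding the environment from scratch after the fix has been rolled out should pick up a working Picard version.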

I'm now waiting for the Bioconda Bot to create an auto-bump pull request and then the fix will be rolled out through Bioconda in the following days.

JakeLehle commented 2 years ago

Awesome! I was looking at that broadinstitute/picard#1747 issue last night and was gonna test it out today, haha. You beat me to the punch. I was gonna roll the picard version back to 2.26.4. I also thought the openjdk version might be an issue, since the wg-blimp anaconda environment uses openjdk 11.0.9.1, so I was gonna try either rolling it back to 11.0.2 or pushing it forward to 13.0.2 to see if that fixed it. Glad this will get fixed soon!

I'm gonna start training up my students in the lab to start running this today.

MarWoes commented 2 years ago

Haha, good timing :)

The changes have now been propagated to Bioconda, so I hope the issue is now resolved (fingers crossed!). I'll close this issue for the time being, but don't hesitate to re-open it if problems with Picard remain!

JakeLehle commented 2 years ago

Pulled the repository and built the anaconda environment from scratch again and everything is working now with the sample data! This issue is resolved.