molikd / otb

Only The Best (Genome Assembly Tools)

generate the hic files #52

Closed molikd closed 2 years ago

molikd commented 2 years ago

Run yahs juicer_pre on the yahs outputs, and the Juicebox tools juicer pre on the fragments list from hicstuff.

Astahlke commented 2 years ago

This is how we create the .assembly and .hic files for yahs:

# -a option creates an assembly that is editable in juicebox
# stderr is captured because the PRE_C_SIZE records are read from it below
juicer_pre -a -o no_ec no_ec.bin no_ec_scaffolds_final.agp genome.fasta.fai \
        2>tmp_juicer_pre_JBAT.log
# keep the size records, dropping the leading PRE_C_SIZE tag
grep "PRE_C_SIZE" tmp_juicer_pre_JBAT.log | cut -d' ' -f2- > no_ec_yahs.JBAT.chrom.sizes

# write to a .part file first, then rename it to the final .hic name
java -jar -Xmx250G $JB_DIR/juicer_tools_1.22.01.jar pre \
        no_ec_scaffolds.txt no_ec_yahs.out.hic.part no_ec_yahs.JBAT.chrom.sizes --threads 40
mv no_ec_yahs.out.hic.part no_ec_yahs.out.hic
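
For readers unfamiliar with the grep and cut step above, here is a minimal sketch of what it extracts. The log lines are hypothetical stand-ins, not real juicer_pre output; the only assumption is that each size record starts with a PRE_C_SIZE tag followed by a space-separated name and length.

```shell
#!/usr/bin/env bash
# Hypothetical log content: a stand-in for what juicer_pre -a writes to stderr.
log=$(mktemp)
cat > "$log" <<'EOF'
[I] some unrelated progress message
PRE_C_SIZE: assembly 1000000
EOF

# Keep only the size records and drop the first field (the PRE_C_SIZE tag),
# leaving "name length" pairs in the shape juicer_tools pre expects.
chrom_sizes=$(grep "PRE_C_SIZE" "$log" | cut -d' ' -f2-)
echo "$chrom_sizes"
rm -f "$log"
```

In the real run the result is redirected into the chrom.sizes file rather than held in a variable.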

molikd commented 2 years ago

This requires a Java/Alpine container as the base image: https://hub.docker.com/_/ibmjava

Using juicer_tools 1.22.

molikd commented 2 years ago

hicstuff does actually output a Juicer-ready file; that's the agb file. The yahs output needs to be converted, however, so I added the juicer_pre from hicstuff into the relevant yahs processes in 977018c.

We will need the above container so that a juicebox run can run juicer pre. Basically, we need something like:

(java -jar -Xmx32G juicer_tools.1.9.9_jcuda.0.8.jar pre out_JBAT.txt out_JBAT.hic.part <(cat out_JBAT.log | grep PRE_C_SIZE | awk '{print $2" "$3}')) && (mv out_JBAT.hic.part out_JBAT.hic)

This container will need java, and should probably have the xvfb hack we use for genomescope.
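
The `.part` then `mv` idiom in the command above can be sketched in isolation. This is only an illustration: `good_writer` and `bad_writer` are hypothetical stand-ins for the juicer_tools call, and the file names are made up. The point is that chaining with `&&` means the final name only ever appears when the writer exits successfully.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)

# Hypothetical stand-ins for the real juicer_tools call:
# one finishes cleanly, one fails after writing partial output.
good_writer() { echo "complete" > "$1"; }
bad_writer()  { echo "partial" > "$1"; return 1; }

# The rename only happens when the writer exits 0.
good_writer "$workdir/ok.hic.part"  && mv "$workdir/ok.hic.part"  "$workdir/ok.hic"
bad_writer  "$workdir/bad.hic.part" && mv "$workdir/bad.hic.part" "$workdir/bad.hic"

ls "$workdir"
```

A downstream process that checks for the final .hic name therefore never picks up a truncated file from a failed run.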

Astahlke commented 2 years ago

You don't need xvfb for the juicer_tools jar to produce the hic file.

molikd commented 2 years ago

I was thinking about putting it in so that it'd be a fully functional Juicer Singularity container.

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days

molikd commented 2 years ago

this is in and works.