Closed Al-Murphy closed 3 years ago
Hi Alan,
It is odd that C.bigwig worked but the others failed; I have never encountered this before. I checked the A, T, G, C bigwig files and the genome ranges are the same (hg19/GRCh37). Nevertheless, I now also provide the hg38/GRCh38 version, which you can download here: https://guanfiles.dcmb.med.umich.edu/Leopard/dna_grch38/ I generated these bigwig files directly from the hg38 FASTA instead of lifting them over. Let me know whether or not they work for you.
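For reference, a minimal sketch of how per-base one-hot tracks like A/T/G/C.bigwig can be derived directly from a FASTA sequence (the function name and numpy representation here are my own illustration, not necessarily how the Leopard scripts do it; writing the arrays out as bigwig would be a separate step):

```python
import numpy as np

def one_hot_tracks(seq):
    """Convert a DNA sequence into four per-base 0/1 tracks (A, C, G, T).

    Each track holds 1.0 wherever the sequence matches that base and 0.0
    elsewhere; ambiguous bases such as N become 0 in all four tracks.
    """
    arr = np.frombuffer(seq.upper().encode("ascii"), dtype=np.uint8)
    return {base: (arr == ord(base)).astype(np.float32) for base in "ACGT"}

tracks = one_hot_tracks("ACGTN")
# tracks["A"] -> [1, 0, 0, 0, 0]; tracks["T"] -> [0, 0, 0, 1, 0]
```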
Thanks, Hongyang
Hi @Hongyang449,
Thank you for the hg38 DNA bigwigs, that is very helpful. I did notice, however, that although converting your avg DNase bigwig to hg38 didn't produce an error, training the model with it did. Is there any chance you have an hg38 version of the avg.bigwig at https://guanfiles.dcmb.med.umich.edu/Leopard/dnase_bigwig/avg.bigwig?
Thanks, Alan.
Hi Alan,
I suggest generating the average bigwig from your own data using this script:
https://github.com/GuanLab/Leopard/blob/master/data/calculate_avg_bigwig.py
You can run it like this:
python calculate_avg_bigwig.py -i INPUT1.bigwig INPUT2.bigwig INPUT3.bigwig -o avg.bigwig -rg grch38
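Conceptually, the script averages the signal at each base position across the input tracks. A simplified numpy sketch (the arrays below are stand-ins for the per-chromosome values that would in practice be read with pyBigWig; the variable names are illustrative):

```python
import numpy as np

# Stand-ins for per-base signal read from each input bigwig
# (in practice these come from each file's values for a chromosome).
track1 = np.array([1.0, 2.0, 4.0])
track2 = np.array([3.0, 0.0, 2.0])
track3 = np.array([2.0, 4.0, 0.0])

# Position-wise mean across the three input tracks.
avg = np.mean(np.stack([track1, track2, track3]), axis=0)
# avg -> [2.0, 2.0, 2.0]
```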
Thanks, Hongyang
Hey Hongyang,
Thanks for the suggestion; I think that probably makes more sense for my application anyway! However, I am still getting the same error when I create the file this way:
2021-08-06 09:31:03.361496: W tensorflow/core/framework/op_kernel.cc:1763] OP_REQUIRES failed at summary_kernels.cc:242 : Invalid argument: Nan in summary histogram for: UNet/initial_conv_layer/conv1d/kernel_0
Traceback (most recent call last):
File "./train.py", line 174, in <module>
model.fit(dna_dataset_train,
File "/rds/general/user/aemurphy/home/anaconda3/envs/tf2_leopard/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 1145, in fit
File "/rds/general/user/aemurphy/home/anaconda3/envs/tf2_leopard/lib/python3.8/site-packages/tensorflow/python/keras/callbacks.py", line 428, in on_epoch_end
File "/rds/general/user/aemurphy/home/anaconda3/envs/tf2_leopard/lib/python3.8/site-packages/tensorflow/python/keras/callbacks.py", line 2339, in on_epoch_end
File "/rds/general/user/aemurphy/home/anaconda3/envs/tf2_leopard/lib/python3.8/site-packages/tensorflow/python/keras/callbacks.py", line 2398, in _log_weights
File "/rds/general/user/aemurphy/home/anaconda3/envs/tf2_leopard/lib/python3.8/site-packages/tensorflow/python/ops/summary_ops_v2.py", line 930, in histogram
File "/rds/general/user/aemurphy/home/anaconda3/envs/tf2_leopard/lib/python3.8/site-packages/tensorflow/python/ops/summary_ops_v2.py", line 858, in summary_writer_function
File "/rds/general/user/aemurphy/home/anaconda3/envs/tf2_leopard/lib/python3.8/site-packages/tensorflow/python/framework/smart_cond.py", line 54, in smart_cond
File "/rds/general/user/aemurphy/home/anaconda3/envs/tf2_leopard/lib/python3.8/site-packages/tensorflow/python/ops/summary_ops_v2.py", line 852, in record
File "/rds/general/user/aemurphy/home/anaconda3/envs/tf2_leopard/lib/python3.8/site-packages/tensorflow/python/ops/summary_ops_v2.py", line 923, in function
File "/rds/general/user/aemurphy/home/anaconda3/envs/tf2_leopard/lib/python3.8/site-packages/tensorflow/python/ops/gen_summary_ops.py", line 479, in write_histogram_summary
File "/rds/general/user/aemurphy/home/anaconda3/envs/tf2_leopard/lib/python3.8/site-packages/tensorflow/python/ops/gen_summary_ops.py", line 498, in write_histogram_summary_eager_fallback
File "/rds/general/user/aemurphy/home/anaconda3/envs/tf2_leopard/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
tensorflow.python.framework.errors_impl.InvalidArgumentError: Nan in summary histogram for: UNet/initial_conv_layer/conv1d/kernel_0 [Op:WriteHistogramSummary]
To note, this error doesn't appear when I use the hg19 avg.bigwig available from your repo. I am creating the average file from the raw bigwigs rather than from the quantile-normalised files (which use liver.bigwig); should I be using the quantile-normalised files for this step instead? Or do you have any idea what else could be causing the issue?
Thanks, Alan.
Hi Alan,
I think the error is related to missing/nan values (e.g. in the tail region of a chromosome) in the bigwig files. I filled those nan positions with zeros in the hg19 avg.bigwig, and the quantile normalization scripts automatically fill nan with zeros. I've updated calculate_avg_bigwig.py to fix the nan issue. As for the avg.bigwig calculation, it is better to compute the average from the quantile-normalized files.
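The nan-filling step described above can be reproduced with numpy before averaging; a sketch, assuming missing positions come back as nan (as bigwig readers such as pyBigWig return them):

```python
import numpy as np

# Per-base values as read from a bigwig; missing positions (e.g. the
# unmapped tail of a chromosome) come back as nan.
values = np.array([0.5, np.nan, 1.5, np.nan])

# Replace nan with 0 so training never sees nan, which would otherwise
# propagate into the model weights and break the summary histograms.
filled = np.nan_to_num(values, nan=0.0)
# filled -> [0.5, 0.0, 1.5, 0.0]
```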
Thanks, Hongyang
Thanks for all your help, this worked!
Hi,
I am using Leopard to train on a large epigenomic dataset based on hg38. I therefore tried to lift over the one-hot encoded DNA bigwigs, the liver bigwig and the average bigwig from hg19 to hg38. Unfortunately, the A, T and G bigwigs all failed when using CrossMap. Here is the error message for the T.bigwig file:
I would usually expect this error when using a CrossMap chain file that doesn't match the genome build, but the one-hot encoded DNA bigwigs are built on hg19, correct?
Oddly, C.bigwig did not fail. Is this something you have encountered? Or is there an alternative way for me to create the A, T, G, C bigwig files for hg38? Alternatively, if you have versions of the A, T, G, C, avg and liver bigwigs based on hg38, that would be great.
Thanks, Alan.