Genome annotation data from the full-stack ChromHMM model, trained with 1032 datasets from 127 reference epigenomes. Please refer to the published manuscript in Genome Biology.
Full-stack genome annotations for the hg19 reference assembly can be found here. Within this folder:
├── hg19_genome_100_browser.bb: bigBed file of the hg19 full-stack annotation.
This file can only be viewed properly on the UCSC Genome Browser through the trackhub link:
https://public.hoffman2.idre.ucla.edu/ernst/2K9RS///full_stack/full_stack_annotation_public_release/hub.txt
├── hg19_genome_100_browser.bed.gz: bed.gz version of the bigBed file hg19_genome_100_browser.bb.
This file can be viewed as a custom track on the UCSC Genome Browser.
├── hg19_genome_100_segments.bed.gz: bed.gz version of the segmentation with only 4 columns corresponding to
chrom, start, end, full_stack state
├── state_annotation_processed_publish.csv: annotations of the states
(a more complete, Excel-format version of this file is Additional File 3 in our published paper)
├── state_annot_README.md: README for the file state_annotation_processed_publish.csv
└── trackDb.txt
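The 4-column segmentation file listed above (chrom, start, end, full_stack state) can be summarized with a few lines of Python. The following is a minimal sketch and is not part of this repository: the function name `bases_per_state` is hypothetical, and the code assumes whitespace-delimited fields and standard half-open BED coordinates, so that `end - start` is the segment length.

```python
import gzip
from collections import Counter


def bases_per_state(path):
    """Sum the number of bases covered by each full-stack state.

    Assumes a 4-column BED-like file: chrom, start, end, state.
    Plain-text and gzip-compressed (.gz) files are both accepted.
    """
    totals = Counter()
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            fields = line.split()
            # Skip blank lines, comments, and UCSC header lines.
            if len(fields) < 4 or fields[0] in ("track", "browser") or fields[0].startswith("#"):
                continue
            start, end, state = int(fields[1]), int(fields[2]), fields[3]
            totals[state] += end - start
    return totals
```

Calling `bases_per_state("hg19_genome_100_segments.bed.gz")` on a local copy of the file would return a `Counter` that maps each state name to the total number of bases assigned to it.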
Full-stack genome annotations for the hg38 reference assembly can be found here. Full-stack annotations in hg38 were created by lifting over the annotation from hg19 to hg38. The code for the whole liftOver pipeline is provided here. Within this folder:
├── hg38_genome_100_browser.bb: bigBed file of the hg38 full-stack annotation lifted over from hg19.
This file can only be viewed properly on the UCSC Genome Browser through the trackhub link:
https://public.hoffman2.idre.ucla.edu/ernst/2K9RS///full_stack/full_stack_annotation_public_release/hub.txt
├── hg38_genome_100_browser.bed.gz: bed.gz version of the bigBed file hg38_genome_100_browser.bb.
This file can be viewed as a custom track on the UCSC Genome Browser.
├── hg38_genome_100_segments.bed.gz: bed.gz version of the segmentation with only 4 columns corresponding to
chrom, start, end, full_stack state
├── state_annotations_processed.csv: annotations of the states
(a more complete, Excel-format version of this file is Additional File 3 in our published paper)
├── state_annot_README.md: README for the file state_annotation_processed_publish.csv
└── trackDb.txt
Since the publication of our data, the full-stack annotation in hg19 has NOT changed (except that our state names were changed from a 0-based indexing system, 0_GapArtf1-99_TSS2, to a 1-based one, 1_GapArtf1-100_TSS2). The hg38 annotation has been changed across 3 versions.
You can view the full-stack annotations in hg19, hg38 and mm10 (see the mouse GitHub repository) as a trackhub on the UCSC Genome Browser using the link: https://public.hoffman2.idre.ucla.edu/ernst/2K9RS///full_stack/full_stack_annotation_public_release/hub.txt
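If you still hold files that use the older 0-based state names (0_GapArtf1 through 99_TSS2), they can be mapped onto the current 1-based names (1_GapArtf1 through 100_TSS2) by incrementing the numeric prefix. The helper below is only an illustrative sketch: the name `to_one_based` is hypothetical, and it assumes every state name has the form `<index>_<mnemonic>`.

```python
def to_one_based(state_name):
    """Convert a 0-based state name (e.g. '0_GapArtf1') to its 1-based form."""
    # Split only on the first underscore so the mnemonic is kept intact.
    index, sep, mnemonic = state_name.partition("_")
    if not sep or not index.isdigit():
        raise ValueError(f"unexpected state name: {state_name!r}")
    return f"{int(index) + 1}_{mnemonic}"
```

Under this assumption, `to_one_based("0_GapArtf1")` gives `"1_GapArtf1"` and `to_one_based("99_TSS2")` gives `"100_TSS2"`. Apply it only to files known to use the old names, since running it on current files would shift the indices a second time.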
Each subfolder inside this folder contains a README that can help you understand and apply the code. Note: AF stands for Additional File.
All code is provided under the MIT Open Access License.

Copyright 2021 Ha Vu and Jason Ernst

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
We have tried to comment the code and to provide as much detail as possible on how to reproduce the results. If you run into problems, please contact Ha Vu (havu73@ucla.edu).