Open Just08 opened 2 years ago
For the motif part (commented out in my Dockerfile modification), 0 motifs are loaded:
[Optional] Reading motifs: GFF
#51 1605.7 00:02:30 Loading PWMs from : /opt/snpEff-4.3T/./data/GRCm38.102/pwms.bin
#51 1605.7 00:02:30 Loading motifs from : /opt/snpEff-4.3T/./data/GRCm38.102/motif.gff
#51 1633.2 00:02:58 Loadded motifs: 0
#51 1633.2 00:02:58 Saving motifs to: /opt/snpEff-4.3T/./data/GRCm38.102/motif.bin
For the regulation part, I tried:
gunzip ${PACKAGE_DIR}/snpEff-4.3T/data/GRCm38.102/*.gz && \
mkdir ${PACKAGE_DIR}/snpEff-4.3T/data/GRCm38.102/regulation.bed && \
wget -nv -r -np -nd -A "*.bed.gz" -e robots=off http://ftp.ensembl.org/pub/release-102/regulation/mus_musculus/Peaks/ && \
ls *.bed.gz | awk -F"." -v mvCmd='mv "%s" "%s"\n' '{printf mvCmd,$0,"regulation."$3"."$4".bed.gz"}' | sh && \
mv regulation.*.bed.gz ${PACKAGE_DIR}/snpEff-4.3T/data/GRCm38.102/regulation.bed/ && \
gunzip ${PACKAGE_DIR}/snpEff-4.3T/data/GRCm38.102/regulation.bed/*.bed.gz
But I hit the same issue that was already reported, without any solution, here: https://github.com/pcingola/SnpEff/issues/304
This is why I can't complete the "Building databases: Regulatory and Non-coding" part of the SnpEff documentation.
I can confirm the bug regarding the regulatory database build. Anyway, I found a workaround: convert the BED files to GFF. If the format is like the line below, it will just work. Only columns 1, 4, 5, and 9 need valid entries. For the attributes, only Cell_type seems to be mandatory, but setting name, alias, etc. could be useful at some point.
chr1 source feature 4426826 4427337 . . . Cell_type=CHD2_CH12_LX__Enriched_Site
All BED files should be combined into a single GFF file, which can be gzipped (.gz) to save space.
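A minimal sketch of the workaround above, in case it helps. It is not taken from SnpEff itself and rests on several assumptions: the BED files are tab-separated with chromosome, 0-based start, and end in the first three columns; the Cell_type label is derived from each file name; and "source" and "feature" are placeholders copied from the example line. The function names `bed_to_gff` and `combine_beds_to_gff` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print GFF lines for one BED file.
# Usage: bed_to_gff LABEL FILE.bed
bed_to_gff() {
    label=$1
    bed=$2
    awk -F'\t' -v OFS='\t' -v label="$label" '
        # Skip blank lines, comments, and track/browser header lines.
        /^$/ || /^#/ || /^track/ || /^browser/ { next }
        # Assumption: BED starts are 0-based, GFF starts are 1-based,
        # so column 2 is shifted by one. Columns 2, 3, 6, 7, 8 are placeholders.
        { print $1, "source", "feature", $2 + 1, $3, ".", ".", ".", "Cell_type=" label }
    ' "$bed"
}

# Hypothetical helper: convert every *.bed file in DIR into one gzipped GFF.
# Assumption: the file name without ".bed" is used as the Cell_type label.
# Usage: combine_beds_to_gff DIR OUT.gff.gz
combine_beds_to_gff() {
    dir=$1
    out=$2
    for f in "$dir"/*.bed; do
        [ -e "$f" ] || continue   # nothing matched the glob
        bed_to_gff "$(basename "$f" .bed)" "$f"
    done | gzip > "$out"
}
```

For example, `combine_beds_to_gff regulation.bed regulation.gff.gz` would write a single combined file. Which file name and location SnpEff expects for the regulation GFF should be checked against its documentation before building.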
The new SnpEff 102 and VEP 102 annotations work. Note that my custom SnpEff database built with Ensembl 102 data does not contain the regulation and motif databases because of the issues above (the motif part is commented out in my Dockerfile modification).