mcsimenc closed this issue 2 years ago
Did you create the fasta file with: https://github.com/aertslab/create_cisTarget_databases/blob/master/create_fasta_with_padded_bg_from_bed.sh
Can you post the output of the following 2 commands?
file /home/msimenc/analysis/scrnaseq/scenic/create_cisTarget_database/motifs/AP2EREBP_tnt.ERF104_col_a_m1.cb /home/msimenc/analysis/scrnaseq/scenic/create_cisTarget_database/Athaliana.Col-0.HPIv01_10k.promoters.fasta
head -n30 /home/msimenc/analysis/scrnaseq/scenic/create_cisTarget_database/motifs/AP2EREBP_tnt.ERF104_col_a_m1.cb /home/msimenc/analysis/scrnaseq/scenic/create_cisTarget_database/Athaliana.Col-0.HPIv01_10k.promoters.fasta
Thanks for the reply. I didn't use create_fasta_with_padded_bg_from_bed.sh to make the fasta but I will try it.
Here are the outputs:
(create_cistarget_databases) [msimenc@KIWI create_cisTarget_database]$ file /home/msimenc/analysis/scrnaseq/scenic/create_cisTarget_database/motifs/AP2EREBP_tnt.ERF104_col_a_m1.cb /home/msimenc/analysis/scrnaseq/scenic/create_cisTarget_database/Athaliana.Col-0.HPIv01_10k.promoters.fasta
/home/msimenc/analysis/scrnaseq/scenic/create_cisTarget_database/motifs/AP2EREBP_tnt.ERF104_col_a_m1.cb: ASCII text
/home/msimenc/analysis/scrnaseq/scenic/create_cisTarget_database/Athaliana.Col-0.HPIv01_10k.promoters.fasta: ASCII text
(create_cistarget_databases) [msimenc@KIWI create_cisTarget_database]$ head -n30 /home/msimenc/analysis/scrnaseq/scenic/create_cisTarget_database/motifs/AP2EREBP_tnt.ERF104_col_a_m1.cb /home/msimenc/analysis/scrnaseq/scenic/create_cisTarget_database/Athaliana.Col-0.HPIv01_10k.promoters.fasta
==> /home/msimenc/analysis/scrnaseq/scenic/create_cisTarget_database/motifs/AP2EREBP_tnt.ERF104_col_a_m1.cb <==
>AP2EREBP_tnt.ERF104_col_a_m1
0.271095 0.172352 0.348294 0.208259
0.145422 0.569120 0.104129 0.181329
0.174147 0.554758 0.100539 0.170557
0.296230 0.109515 0.371634 0.222621
0.120287 0.526032 0.100539 0.253142
0.154399 0.651706 0.059246 0.134650
0.224417 0.089767 0.382406 0.303411
0.039497 0.662478 0.136445 0.161580
0.147217 0.833034 0.010772 0.008977
0.064632 0.000000 0.890485 0.044883
0.000000 0.996409 0.003591 0.000000
0.000000 1.000000 0.000000 0.000000
0.048474 0.000000 0.946140 0.005386
0.003591 0.836625 0.000000 0.159785
0.007181 0.992819 0.000000 0.000000
0.384201 0.000000 0.504488 0.111311
0.071813 0.491921 0.091562 0.344704
0.233393 0.391382 0.107720 0.267504
==> /home/msimenc/analysis/scrnaseq/scenic/create_cisTarget_database/Athaliana.Col-0.HPIv01_10k.promoters.fasta <==
>AT1G01030.Araport11.447
TCACTCACTTTGTTAAAAGAATAATTCAGTGTCTGGACACTAAAATCTTCCAAAAACCCC
ATATACATATATGCTATTTCGATACTTATATTTATTTACTCAGCATAAAAAATATTAACC
ATGTATTCATAGTAAAATGTTTCATGTGATATCAAACCAGCGACAACAAAAGTATTATTC
CCCTCATTATGTTTGACTCCTATTATATTTTTATTTTAATTTTTTTCACTATCATCTTTC
TTGCAATGAAAGTCCCATATATTGGTCAACATTTCAAACCACTTGTTCTCTTTTATGTTT
TGGTAAGAGCTATCTTCTAAATTTATAATACGCATAAATTCAAAAGTAAAAGAAAATTTT
GGTCATGAATGTTGTTTAAGTCATTTGGAGATACGAAATCAAATCTCCTTGTAGATTTTG
TTTTTAGAATGTCGTTCCTTTTTCATCATCTTAGCTATATCTACAGCTATATATCCTATC
TTTAAACCTATATTATTTTTTCCTCTCTTCACCAAAGCCATGTTTTTTAGTTGTGGCGAA
AAATAAGAAATCCATACATCAACATATCGCTTTCGTTACCTTAAATTTTGGCTTGTTATG
AAGGCATGTCATAACGTTTCTAGTCACAACTCACAAGCATACCAACGACCATGATAAATC
CAAAAAGTAGAAACAATCTATTATCTAAACCCCCAAAAGACAAAAGAAAAAAGTAGAAAG
AAAAGGTAGGCAGAGATATAATGCTGGTTTTATTTGTTTGTTAAAAGATATTGCTATTTC
TGCCAATATTAAAACTTCACTTAGGAAGACTTGAACCTACCACACGTTAGTGACTAATGA
GAGCCACTAGATAATTGCATGCATCCCACACTAGTACTAATTTTCTAGGGATATTAGAGT
TTTCTAATCACCTACTTCCTACTATGTGTATGTTATCTACTGGCGTGGATGCTTTTAAAG
ATGTTACGTTATTATTTTGTTCGGTTTGGAAAACGGCTCAATCGTTATGAGTTCGTAAGA
CACATACATTGTTCCATGATAAAATGCAACCCCACGAACCATTTGCGACAAGCAAAACAA
CATGGTCAAAATTAAAAGCTAACAATTAGCCAGCGATTCAAAAAGTCAACCTTCTAGATG
GATTTAACAACATATCGATAGGATTCAAGATTAAAAATAAGCACACTCTTATTAATGTTA
AAAAACGAATGAGATGAAAATATTTGGCGTGTTCACACACATAATCTAGAAGACAGATTC
GAGTTGCTCTCCTTTGTTTTGCTTTGGGAGGGACCCATTATTACCGCCCAGCAGCTTCCC
AGCCTTCCTTTATAAGGCTTAATTTATATTTATTTAAATTTTATATGTTCTTCTATTATA
ATACTAAAAGGGGAATACAAATTTCTACAGAGGATGATATTCAATCCACGGTTCACCCAA
ACCGATTTTATAAAATTTATTATTAAATCTTTTTTAATTGTTAAATTGGTTTAAATCTGA
ACTCTGTTTACTTACATTGATTAAAATTCTAAACCATCATAAGTAAAAAATAATATGATT
AAGACTAATAAATCTTAATAGTTAATACTACTCGGTTTACTACATGAAATTTCATACCAT
CAATTGTTTTAATAATCTTTAAAATTGTTAGGACCGGTAAAACCATACCAATTAAACCGG
AGATCCATATTAATTTAATTAAGAAAATAAAAATAAAAGGAATAAATTGTCTTATTTAAA
Do you have >seq_name lines which are not followed by a sequence?
Also, rescale your Cluster-Buster matrices to 100: by default Cluster-Buster adds a pseudocount of 0.375 to each matrix element, which is large relative to frequencies between 0 and 1.
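A minimal sketch of the rescaling suggested above, in Python. It assumes that "rescale to 100" means multiplying each matrix row so that its four values sum to 100, and that the file uses the layout shown in the head output (a >name header followed by whitespace-separated rows). The function name rescale_cb and the four-decimal output format are illustrative choices, not part of Cluster-Buster or create_cisTarget_databases.

```python
def rescale_cb(text, target=100.0):
    """Rescale every matrix row in Cluster-Buster formatted text.

    Assumption: each non-header, non-empty line is one matrix row, and
    "rescale to 100" means the row should sum to `target`.
    """
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        # Keep motif headers (">name") and blank lines unchanged.
        if not stripped or stripped.startswith(">"):
            out.append(line)
            continue
        values = [float(v) for v in stripped.split()]
        total = sum(values)
        if total == 0:
            # An all-zero row cannot be rescaled; leave it untouched.
            out.append(line)
            continue
        scaled = [v * target / total for v in values]
        out.append(" ".join(f"{v:.4f}" for v in scaled))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # First row of the motif posted in this thread.
    example = ">AP2EREBP_tnt.ERF104_col_a_m1\n0.271095 0.172352 0.348294 0.208259\n"
    print(rescale_cb(example), end="")
```

To apply this to a real file, read it with open(path).read(), pass the text to rescale_cb, and write the result to a new file rather than overwriting the original motif.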
Thanks for the tip about scaling the matrices.
Yes, the problem was a sequence header without any sequence! I used samtools faidx to see the lengths of all sequences but it omitted the ones that were just headers. Thank you for your help!
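Since the index-based length check missed the header-only records, a direct scan of the FASTA file is one way to find them. The following is a small Python sketch, not part of any of the tools discussed here; the function name headers_without_sequence is hypothetical, and it treats a record as empty when no non-whitespace characters follow its header.

```python
def headers_without_sequence(lines):
    """Return the names of FASTA records that have a header but no sequence.

    `lines` can be any iterable of text lines, for example an open file.
    """
    empty = []
    name = None   # name of the record currently being read
    length = 0    # sequence length seen so far for that record
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record; report it if empty.
            if name is not None and length == 0:
                empty.append(name)
            # Use the first whitespace-delimited token as the record name.
            name = line[1:].split()[0] if len(line) > 1 else ""
            length = 0
        elif name is not None:
            length += len(line)
    # The last record in the file has no following header to close it.
    if name is not None and length == 0:
        empty.append(name)
    return empty


if __name__ == "__main__":
    demo = [">seq1", "ACGT", ">seq2", ">seq3", "GG", ">seq4"]
    print(headers_without_sequence(demo))
```

For a file on disk, open it and pass the file handle directly, for example `with open(fasta_path) as handle: headers_without_sequence(handle)`, where fasta_path is whatever path your promoter FASTA has.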
When I run
it reports
for every motif.
I tried running the cbust command on its own but get:
However, running this cbust command using -f 0 or -f 1 results in output.
I opened an issue at the cluster buster repository:
https://github.com/weng-lab/cluster-buster/issues/5
Any help would be appreciated!