Status: Closed (ewitt1093 closed this issue 3 years ago)
hi @ewitt1093, can you please post the full command you used? it seems to me that the options got mixed up. the script is looking for a motif called '>2L ...', which is the name of a chromosome.
the option -M should point to a directory containing one .cb file per motif, e.g.:
-M /path/to/motif_dir/
ls /path/to/motif_dir/
jaspar__MA0150.2.cb
jaspar__MA0151.1.cb
jaspar__MA0152.1.cb
jaspar__MA0153.2.cb
and -m should point to a file containing a list of motif names (the .cb file names without the extension), e.g.:
-m motif_names.lst
cat motif_names.lst
jaspar__MA0150.2
jaspar__MA0151.1
jaspar__MA0152.1
jaspar__MA0153.2
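The error in this issue comes from a motif name in the list that has no matching `<name>.cb` file in the motif directory. As a minimal sketch (not part of the cisTarget scripts), the two inputs can be checked against each other before running anything. The function name `find_missing_motif_files` is hypothetical, and the sketch assumes the list file holds one motif name per line.

```python
from pathlib import Path


def find_missing_motif_files(motif_dir, motif_list_file, extension=".cb"):
    """Return the motif names from the list that have no matching file in motif_dir.

    Assumption: the list file contains one motif name per line and each
    motif is expected at <motif_dir>/<name><extension>.
    """
    motif_dir = Path(motif_dir)

    # Read the motif names, skipping blank lines.
    names = [
        line.strip()
        for line in Path(motif_list_file).read_text().splitlines()
        if line.strip()
    ]

    # Keep only the names whose .cb file does not exist.
    return [
        name for name in names
        if not (motif_dir / f"{name}{extension}").is_file()
    ]
```

Calling `find_missing_motif_files("/path/to/motif_dir/", "motif_names.lst")` should return an empty list when every listed motif has its own .cb file.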
Ah, I see the problem: my output from cbust is a single file. When you run cbust, how do you split the output into a separate file for each motif instead of one big file?
When I ran cbust I used this command: cbust [file with matrices of JASPAR motifs from clusterbuster website] [dmel-all-chromosome-r6.15.fasta] >output.cb
hi @ewitt1093, you shouldn't have to run cbust yourself. create_cistarget_motif_databases.py does that for you. so the *.cb files contain the motifs in cbust format, not the cbust scores.
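If the motifs are only available as one combined matrix file, a rough sketch like the following could split it into one .cb file per motif and write the matching name list. This is not part of the cisTarget scripts. It assumes each motif starts with a `>` header line followed by its matrix rows, and that the first whitespace-delimited token of the header (e.g. `MA0100`) is a unique ID that can serve as the file name. The function name `split_motif_file` is hypothetical.

```python
from pathlib import Path


def split_motif_file(combined_file, out_dir, list_file):
    """Split a combined motif file into one .cb file per motif.

    Assumptions (not confirmed by the cisTarget documentation):
    - every motif begins with a header line starting with ">";
    - the first token of the header is a unique motif ID;
    - the original header line is kept unchanged in each output file.
    Returns the motif IDs in the order they appear.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    motifs = {}      # motif ID -> lines belonging to that motif
    current = None   # ID of the motif currently being read

    for line in Path(combined_file).read_text().splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            if not tokens:
                raise ValueError("Header line without a motif ID")
            current = tokens[0]
            if current in motifs:
                raise ValueError(f"Duplicate motif ID: {current}")
            motifs[current] = [line]
        elif current is not None and line.strip():
            # Matrix row of the current motif.
            motifs[current].append(line)

    # One <ID>.cb file per motif.
    for motif_id, lines in motifs.items():
        (out_dir / f"{motif_id}.cb").write_text("\n".join(lines) + "\n")

    # One motif ID per line, matching the .cb file names.
    Path(list_file).write_text("".join(f"{motif_id}\n" for motif_id in motifs))

    return list(motifs)
```

The output directory would then be the value for -M and the list file the value for -m.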
Oh, thanks so much for the clarification! My other question is: which command do I use to build a cistarget database with bigwig files?
unfortunately, this hasn't been implemented yet.
All right, I'll calculate motifs on my own first. Thanks for the help!
Hello, I am working through the instructions to create a custom cistarget database. Here are the files I have:
- Drosophila melanogaster whole-genome FASTA
- FASTA for genes/features of interest
- separate bigWig files from ENCODE for the TFs listed in https://resources.aertslab.org/cistarget/track2tf/encode_modERN_20190621__ChIP_seq.drosophila_melanogaster.dm6.track_to_tf_in_motif_to_tf_format.tsv
- the JASPAR motifs from https://zlab.bu.edu/clover/jaspar2005core
- Cluster-Buster output file for the JASPAR motifs against the Drosophila melanogaster whole-genome FASTA, which looks like this:

`>2L (23513712 bp)
CLUSTER 1 Location: 20704906 to 20706440 Score: 23.7 MA0073: 6.23 MA0015: 4.03 MA0074: 2.75 MA0052: 1.64 MA0096: 1.21 MA0082: 1.06 MA0043: 1.03 MA0068: 1.02 MA0025: 0.764`
I tried running create_cistarget_databases.py with -M pointing to the directory with the cbust output file, and -m with the path of the cbust output, and I get:
Error: Cluster-Buster motif filename "/rugpfs/fs0/zhao_lab/scratch/ewitt/witt/singlecell/clusterbuster/cluster-buster/>2L (23513712 bp).cb" does not exist for motif >2L (23513712 bp).
I tried using just the JASPAR motif matrix for -m, and I get:
Error: Cluster-Buster motif filename ">MA0100 c-MYB_1 TRP-CLUSTER.cb" does not exist for motif >MA0100 c-MYB_1 TRP-CLUSTER.
Can you please help me understand how to properly format the inputs for this process? Thank you very much.