Closed by npklein 7 years ago
@npklein This happens because the software did not find any motifs in your motif file, as signaled by _data: []. Where did you get the motifs? The motif file should be in MEME format. You can look at the examples here: https://github.com/kaizhang/Taiji/tree/master/docs/data/motifs
@kaizhang The motif file did indeed have some erroneous lines. Using the corrected file, though, I get:
[WARN][07-15 20:15] Find_TF_sites: Failed!
[ERROR][07-15 20:15] "Find_TF_sites" failed. The error was: Bio.Seq.Query.openGenome: Incorrect format
CallStack (from HasCallStack):
  error, called at src/Bio/Seq/IO.hs:44:14 in bioinformatics-toolkit-0.3.2-6jGTx2VGtZyEsm7mUTIiFH:Bio.Seq.IO.
which I guess comes from https://github.com/kaizhang/bioinformatics-toolkit/blob/master/bioinformatics-toolkit/src/Bio/Seq/IO.hs where you check whether magic = "<HASKELLBIOINFORMATICS_7d2c5gxhg934>" is at the top of an input file (I can't find which file you are reading here; I'm guessing the genome or genome index file from the variable names).
I'm not sure how this header would get into either of these files, though: I use the 1000G FASTA genome reference and index it with samtools faidx.
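For anyone hitting the same error, one quick way to test the guess above is to look at the first bytes of the file in question. The following is only a sketch: the magic string is the one quoted from IO.hs, but the assumption that it sits at byte offset 0 as plain ASCII, and the helper name starts_with_magic, are mine and not taken from the toolkit.

```python
# Magic string quoted in this thread. That it is stored at the very start of
# the index file, as plain ASCII, is an assumption made for this sketch.
MAGIC = b"<HASKELLBIOINFORMATICS_7d2c5gxhg934>"


def starts_with_magic(path, magic=MAGIC):
    """Return True if the file at `path` begins with `magic`.

    Hypothetical helper for illustration; it is not part of Taiji or
    bioinformatics-toolkit.
    """
    with open(path, "rb") as handle:
        return handle.read(len(magic)) == magic
```

If this returns False for the file the config points at, the file was probably not written by the program itself.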
@npklein
################################################################################
# You don't have to physically provide the following files. But you do need to
# specify the locations where these files will be *GENERATED AUTOMATICALLY WHEN
# FILES/DIRECTORIES DOES NOT EXIST*. If the specified directories or files
# already exist, the program will do nothing.
# If this is the first time you run the program, make sure delete existing
# files/directories first so indices can be generated properly.
# You only need to generate the indices once, *THEY CAN BE REUSED*.
################################################################################
# This is the *FILE* containing GENOME SEQUENCE INDEX.
seqIndex: "/home/kai/genome/GRCh38/GRCh38.index"
You should not generate this file yourself. Delete the file you generated and re-run "Initialization". The program will then generate the correct index for you.
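To tell whether an existing seqIndex file was produced by samtools faidx rather than by the program, a rough heuristic is to check for the .fai layout: five tab-separated columns (name, length, offset, bases per line, bytes per line), the last four being integers. The sketch below assumes that layout; looks_like_fai is a hypothetical helper, not something Taiji provides.

```python
def looks_like_fai(path, max_lines=5):
    """Heuristically decide whether `path` is a samtools faidx index.

    Checks that the first few lines each have five tab-separated fields
    and that fields 2 to 5 are non-negative integers. Hypothetical helper
    for illustration only.
    """
    checked = 0
    # errors="replace" keeps binary files from raising a decode error.
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) != 5 or not all(f.isdigit() for f in fields[1:]):
                return False
            checked += 1
            if checked >= max_lines:
                break
    # An empty file is not treated as a .fai index.
    return checked > 0
```

A file that passes this check is a candidate for deletion before re-running "Initialization".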
D'oh! Was using a config without the comments and didn't think about that. I got my ranks now, thanks for all the help!
Hi @kaizhang, sorry, I've run into another issue and haven't been able to solve it.
The Find_TF_sites step does not seem to search for me:
So I looked back at the previous steps, and it seems _prepare also did not get the data
but the ATAC_callpeaks step does seem to have worked (I can upload the full file if needed)
And the .narrowPeak file does get written. I checked whether I had the same problem as before, with the chromosomes having chr in front of them, but this is not the case:
Any ideas where I went wrong?
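The chromosome-naming check mentioned above can be scripted roughly as follows. This is a sketch under assumptions: it takes the first column of the .narrowPeak file as the chromosome name (as in BED-style files) and the first word after ">" in each FASTA header as the sequence name. The function names are hypothetical.

```python
def peak_chromosomes(peak_path):
    """Collect the distinct chromosome names from the first column."""
    names = set()
    with open(peak_path) as handle:
        for line in handle:
            # Skip blank lines and optional header/comment lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            names.add(line.split(None, 1)[0])
    return names


def fasta_sequence_names(fasta_path):
    """Collect sequence names: the first word after '>' in each header."""
    names = set()
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                parts = line[1:].split()
                if parts:
                    names.add(parts[0])
    return names


def unmatched_chromosomes(peak_path, fasta_path):
    """Return peak chromosome names that do not occur in the FASTA."""
    return peak_chromosomes(peak_path) - fasta_sequence_names(fasta_path)
```

An empty result means every chromosome name in the peak file also appears in the reference, so a chr prefix mismatch can be ruled out.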