Closed npklein closed 7 years ago
@npklein It is absolutely possible to rerun only certain steps, but this feature hasn't been documented. To rerun selected steps:
1. Run taiji view. The output is DOT code. You can use the dot program to convert it to a graph, or just open it in a text editor. This is an example for the latest version of taiji: https://github.com/kaizhang/Taiji/blob/master/Taiji.png
2. Delete the steps you want to rerun with taiji rm [STEP_NAME]. For example, taiji rm RNA_alignment_prepare. In your case, you need to delete all steps related to RNA-seq.
3. Run taiji run --config config.yml --select RNA_alignment_prepare,RNA_alignment. This will only execute the specified steps and their dependencies.
Back to your problem: the program did not analyze the RNA-seq data because BAM files are currently not supported for RNA-seq analysis (see this table). You have two options:
If you decide to change the input, you need to rerun "Initialization": taiji rm Initialization && taiji run --config config.yml --select Initialization
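The rerun procedure described above can be sketched as a small shell helper. This is only an illustration: the function names run and rerun_steps are hypothetical, the script prints the commands by default instead of executing them, and it assumes a taiji version whose run command accepts --select.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set RUN=1 to really execute them (assumes `taiji` is on PATH).
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# rerun_steps CONFIG STEP...
# Removes each cached step, then reruns only those steps.
# (Hypothetical helper, not part of taiji.)
rerun_steps() {
    config=$1
    shift
    select_list=""
    for step in "$@"; do
        run taiji rm "$step"
        # Build the comma-separated list expected by --select.
        select_list="${select_list:+$select_list,}$step"
    done
    run taiji run --config "$config" --select "$select_list"
}

rerun_steps config.yml RNA_alignment_prepare RNA_alignment
```

In dry-run mode this prints the two taiji rm commands followed by the taiji run --config config.yml --select RNA_alignment_prepare,RNA_alignment command, so the step names can be checked before anything is deleted.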
@kaizhang That's great, thanks
@kaizhang I updated my input.yml to use RNAseq instead, and uncommented the STAR config lines in config.yml. I removed the RNA steps with
./taiji-Linux-x86_64-static rm Initialization
./taiji-Linux-x86_64-static rm Output_network
./taiji-Linux-x86_64-static rm PageRank
./taiji-Linux-x86_64-static rm RNA_average
./taiji-Linux-x86_64-static rm RNA_average_prepare
./taiji-Linux-x86_64-static rm RNA_convert_ID_to_name
./taiji-Linux-x86_64-static rm RNA_quantification
./taiji-Linux-x86_64-static rm RNA_alignment
./taiji-Linux-x86_64-static rm RNA_alignment_prepare
./taiji-Linux-x86_64-static rm "Get RNA-seq data"
Still, the RNA steps only take a few seconds to run (see below, log of steps is all in same minute), and the RNA_seq folder in output/ is empty. Is there a --debug type of option that shows which commands it is running at each step?
Also, when I try to run single steps, e.g. Initialization, I get an error
[umcg-ndeklein@calculon 18:23:15 TCC_clones_DHSseq]$ ./taiji-Linux-x86_64-static run --config config.yml --select Initialization
Invalid option `--select'

Usage: taiji-Linux-x86_64-static COMMAND
step log
[LOG][07-05 18:20] Initialization: running...
Sequence index exists. Skipped.
BWA index exists. Skipped.
STAR index directory exists. Skipped.
RSEM index directory exists. Skipped.
[LOG][07-05 18:20] Initialization: Finished.
[LOG][07-05 18:20] RNA_alignment_prepare: running...
[LOG][07-05 18:20] RNA_alignment_prepare: Finished.
[LOG][07-05 18:20] RNA_alignment: running...
[LOG][07-05 18:20] RNA_alignment: Finished.
[LOG][07-05 18:20] RNA_quantification: running...
[LOG][07-05 18:20] RNA_quantification: Finished.
[LOG][07-05 18:20] RNA_convert_ID_to_name: running...
[LOG][07-05 18:20] RNA_convert_ID_to_name: Finished.
[LOG][07-05 18:20] RNA_average_prepare: running...
[LOG][07-05 18:20] RNA_average_prepare: Finished.
[LOG][07-05 18:20] RNA_average: running...
[LOG][07-05 18:20] RNA_average: Finished.
[LOG][07-05 18:20] Output_network: running...
[LOG][07-05 18:20] Output_network: Finished.
[LOG][07-05 18:20] PageRank: running...
Running PageRank...
[LOG][07-05 18:20] PageRank: Finished.
@npklein Sorry, I forgot that the version you are using probably does not have the --select option.
You had a typo in your last command: ./taiji-Linux-x86_64-static rm "Get RNA-seq data". The step name should be "Get_RNA_data". If you look at the log, you will see that the step "Get_RNA_data" was not executed, so the data wasn't updated. Sorry that the program currently doesn't warn you when there is no such entry in the database.
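Checking the log for a missing step can be automated. The sketch below assumes that each log entry sits on its own line in the form "[LOG][date time] STEP: running...", as in the log posted above; list_steps and step_ran are hypothetical helper names, and the sample log is a shortened copy of that log.

```shell
#!/bin/sh
# list_steps LOGFILE: print the name of every step that has a "running..." entry.
# An unescaped "]" outside a bracket expression is a literal "]" in a basic regex.
list_steps() {
    sed -n 's/^\[LOG]\[[^]]*] \(.*\): running\.\.\..*$/\1/p' "$1"
}

# step_ran LOGFILE STEP: exit status 0 if STEP is among the executed steps.
step_ran() {
    list_steps "$1" | grep -qxF -- "$2"
}

# Shortened sample of the log from this thread.
log=$(mktemp)
cat > "$log" <<'EOF'
[LOG][07-05 18:20] Initialization: running...
[LOG][07-05 18:20] Initialization: Finished.
[LOG][07-05 18:20] RNA_alignment_prepare: running...
[LOG][07-05 18:20] RNA_alignment_prepare: Finished.
EOF

if step_ran "$log" Get_RNA_data; then
    echo "Get_RNA_data ran"
else
    echo "Get_RNA_data did not run"
fi
rm -f "$log"
```

For the sample log the script reports that Get_RNA_data did not run, which matches the problem described here.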
There is another command that lets you see the cache in the database. If you type taiji cat Get_RNA_data, you can see the input it captured. In your case, it should be empty.
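Because taiji rm does not warn about unknown step names, a small guard around it can catch typos before anything is removed. This is a sketch under assumptions: KNOWN_STEPS contains only the step names mentioned in this thread (in no particular order, and not the full pipeline), is_known_step and safe_rm are hypothetical helpers, and safe_rm prints the command instead of running it.

```shell
#!/bin/sh
# Step names taken from this thread only; the real pipeline has more steps
# (the output of `taiji view` would be the authoritative source).
KNOWN_STEPS="Initialization Get_RNA_data RNA_alignment_prepare RNA_alignment RNA_quantification RNA_convert_ID_to_name RNA_average_prepare RNA_average Output_network PageRank"

# is_known_step NAME: exit status 0 if NAME is listed in KNOWN_STEPS.
is_known_step() {
    for known in $KNOWN_STEPS; do
        [ "$known" = "$1" ] && return 0
    done
    return 1
}

# safe_rm NAME: print the rm command for a known step,
# or a warning on stderr (and a non-zero status) otherwise.
safe_rm() {
    if is_known_step "$1"; then
        echo "taiji rm $1"
    else
        echo "warning: unknown step '$1', nothing removed" >&2
        return 1
    fi
}

safe_rm Get_RNA_data
safe_rm "Get RNA-seq data" || true
```

The first call prints the taiji rm command; the second only emits the warning, since "Get RNA-seq data" is not a step name.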
I assume the problem was solved.
@kaizhang Yes, I got it to recognize my RNA samples, thanks. I have another problem now but will open a new issue.
Hi @kaizhang, after fixing the BAM chromosome names the rest of the pipeline ran without an error. However, the Network and Rank files are empty. The RNA_Seq output dir is also empty, so I'm thinking it might be because the RNAseq part of input.yml is not done correctly, but it looks like the example input file: https://gist.github.com/npklein/dd3acaad067fbf96ba03e46dd7d97c9a.
Also, in the logs it does say that it is doing something with the RNAseq
Also, is there a way to rerun only part of the pipeline? I can remove sciflow.db, but then it also reruns ATAC-seq MarkDuplicates and peak calling, which takes quite long. I would like to test only the RNAseq part of the pipeline (if that is the problem).