Open ptynecki opened 5 years ago
Hi @ptynecki,
Does the example script test/test.sh work OK for you?
@rmenegaux
Correct. The output from the test is below:
Training model fdna_k10_d10_e1
Read sequence n1, ENA|CP000473|CP000473.1|kraken:taxid|234267 ENA|CP000473|CP000473.1 Solibacter usitatus Ellin6076, complete geno
Read sequence n2, ENA|CP001472|CP001472.1|kraken:taxid|240015 ENA|CP001472|CP001472.1 Acidobacterium capsulatum ATCC 51196, comple
Read sequence n3, ENA|CP002542|CP002542.1|kraken:taxid|755732 ENA|CP002542|CP002542.1 Fluviicola taffensis DSM 16823, complete gen
Read sequence n4, ENA|CP004080|CP004080.1|kraken:taxid|1193806 ENA|CP004080|CP004080.1 Dehalococcoides mccartyi BTF08, complete ge
Read sequence n5, ENA|CP003947|CP003947.1|kraken:taxid|755178 ENA|CP003947|CP003947.1 Cyanobacterium aponinum PCC 10605, complete
Read sequence n6, ENA|CP003856|CP003856.1|kraken:taxid|1100841 ENA|CP003856|CP003856.1 Acinetobacter baumannii TYTH-1, complete ge
Read sequence n7, ENA|CP004009|CP004009.1|kraken:taxid|1274814 ENA|CP004009|CP004009.1 Escherichia coli APEC O78, complete genome.
Read sequence n8, ENA|AE017042|AE017042.1|kraken:taxid|229193 ENA|AE017042|AE017042.1 Yersinia pestis biovar Microtus str. 91001,
Read sequence n9, ENA|CP001622|CP001622.1|kraken:taxid|395491 ENA|CP001622|CP001622.1 Rhizobium leguminosarum bv. trifolii WSM1325
Read sequence n10, ENA|CP000975|CP000975.1|kraken:taxid|481448 ENA|CP000975|CP000975.1 Methylacidiphilum infernorum V4, complete genome.
Number of sequences 10
Number of labels: 10
Number of words: 1048576
Progress: 100.0% fragments/sec/thread: 16110 lr: 0.000000 loss: 2.027800 ETA: 0h 0m
Testing model fdna_k10_d10_e1
N 10000
P@1 0.197
R@1 0.197
Number of examples: 10000
Hmm... this error is raised when the labels file cannot be opened, so I don't know why it is being thrown if the file is correct (right path, right permissions). Perhaps you could send it to me.
It should be unrelated, but I get another error when running your command, because the k-mer length defaults to 0 (that is a bug I need to fix). With the command line interface you should give a -minn argument:
./fastdna supervised -input merged_training.fasta -labels labels.txt -output model -minn 4
Hey,
I wanted to use fastDNA, but I have an issue that blocks me.
I prepared a merged_training.fasta file, which contains many FASTA samples from NCBI. I also created a labels.txt file with the labels (one per line).
Input:
Output:
I'm sure that both files are correct with right chmod and permissions.
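For anyone hitting the same error, here is a minimal sketch of a sanity check that can be run outside fastDNA. It is not part of fastDNA: check_inputs is a hypothetical helper, and it assumes the labels file is expected to hold one label per FASTA record, which matches the test output in this thread (10 sequences, 10 labels) but is not confirmed by the maintainers. It only verifies that both files can be opened and compares the record count with the label count.

```python
def check_inputs(fasta_path, labels_path):
    """Return (n_sequences, n_labels) for a FASTA file and a labels file.

    Hypothetical helper, not part of fastDNA. Raises OSError if either
    file cannot be opened (wrong path, missing read permission, ...).
    """
    # Count FASTA records: each record starts with a '>' header line.
    with open(fasta_path) as fasta:
        n_sequences = sum(1 for line in fasta if line.startswith(">"))

    # Count labels: one label per line, blank lines ignored.
    with open(labels_path) as labels:
        n_labels = sum(1 for line in labels if line.strip())

    return n_sequences, n_labels


# Example call with the file names used in this thread:
#   n_sequences, n_labels = check_inputs("merged_training.fasta", "labels.txt")
#   if n_sequences != n_labels:
#       print("mismatch:", n_sequences, "sequences vs", n_labels, "labels")
```

If the call raises OSError, the problem is the path or the permissions as seen by the process rather than the file contents. If the two counts differ, that mismatch is worth mentioning when sending the files.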