BioinformaticsLabAtMUN / Promotech

Machine-learning-based general bacterial promoter prediction tool.
GNU General Public License v3.0

Whole-Genome prediction is missing one decoding step #10

Closed by pmjklemm 1 year ago

pmjklemm commented 1 year ago

I used a sequence containing a known promoter, test2.fna:

>SMc01022_inside
GGAATACGTCGCCCCATAACGCTTTGCCGGCTGTCCGGCCAAATGGCGAATCGACCGGGCGAAAGCGAATAAAAACGAGGCCGCGGGCGTCTAATCGGAGACGCTCTACCATCACGTAGACGTACGCGCCGGGCGTCGGAGTCTGCCATGCTGCGCAAC

This contains the promoter sequence of SMc01022 (https://journals.asm.org/doi/pdf/10.1128/mSphereDirect.00454-18, Table 1, first row), to which I added 60 nt upstream and downstream.

python promotech.py -pg -m RF-HOT -f test2.fna -o results

Next, I ran the prediction:

python promotech.py -g -t 0.6 -i results -o results

The output prints one promoter (good), but it is still one-hot encoded:

(...)
    SEQ: 

0100010010001000000110000010010000010010010000100010001000101000000110001000001001000010000100010001010000100010010001000010000101000001001000100100010000100010 
    PREDICTION: 0.1105
(...)

Please add a final decoding step. Also, results/genome_predictions.csv is empty.

Full output: predict.log.txt
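
For reference, the kind of decoding step requested above could look like the sketch below. The bit-to-base mapping (A=1000, G=0100, C=0010, T=0001) is an assumption inferred by aligning the printed SEQ string with the start of test2.fna; it is not taken from Promotech's source code, and `decode_one_hot` is a hypothetical helper name, not part of the tool.

```python
# Assumed mapping, inferred by aligning the printed SEQ string with test2.fna.
# It is NOT taken from Promotech's source code.
ONE_HOT_TO_BASE = {"1000": "A", "0100": "G", "0010": "C", "0001": "T"}


def decode_one_hot(bits: str) -> str:
    """Decode a flat one-hot string (4 bits per base) into nucleotides."""
    bits = bits.strip()
    if len(bits) % 4 != 0:
        raise ValueError("one-hot string length must be a multiple of 4")
    bases = []
    for i in range(0, len(bits), 4):
        chunk = bits[i:i + 4]
        # Chunks outside the assumed mapping (e.g. all zeros) become "N".
        bases.append(ONE_HOT_TO_BASE.get(chunk, "N"))
    return "".join(bases)


# The SEQ value printed in the output above.
encoded = (
    "0100010010001000000110000010010000010010010000100010001000101000"
    "0001100010000010010000100001000100010100001000100100010000100001"
    "01000001001000100100010000100010"
)
print(decode_one_hot(encoded))
```

Under this mapping, the 160-character string decodes to 40 nt that match the first 40 nt of the SMc01022_inside record (GGAATACGTC...), which suggests the printed SEQ is a 40 nt window of the input rather than the whole sequence.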

pmjklemm commented 1 year ago

Never mind, the output is simply empty with -t 0.6. My bad.