Closed marc-harary closed 3 years ago
I am not entirely sure what the issue is, since you did not include the full code used to generate the final output. However, one possible problem is that (i, j) in your code is 0-indexed, while the label files provided in the SPOT-RNA PDB dataset are 1-indexed.
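To illustrate the indexing point, here is a minimal sketch of turning a dot-bracket string into base-pair tuples with an explicit index offset. The function name `dot_bracket_to_pairs` and the `one_indexed` flag are hypothetical, not part of mxfold2 or SPOT-RNA, and the sketch assumes the structure uses only `(`, `)` and `.` (no pseudoknot brackets).

```python
def dot_bracket_to_pairs(structure, one_indexed=True):
    """Return sorted (i, j) base pairs from a dot-bracket string.

    Hypothetical helper: only '(' and ')' are treated as pairing symbols;
    every other character is treated as unpaired.
    """
    stack = []   # positions of '(' still waiting for a partner
    pairs = []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")

    # enumerate() yields 0-indexed positions; shift by one when the
    # reference labels are 1-indexed.
    offset = 1 if one_indexed else 0
    return sorted((i + offset, j + offset) for i, j in pairs)


print(dot_bracket_to_pairs("((..))", one_indexed=False))  # [(0, 5), (1, 4)]
print(dot_bracket_to_pairs("((..))", one_indexed=True))   # [(1, 6), (2, 5)]
```

If predicted pairs are compared against 1-indexed labels without the shift, every pair is off by one and almost none will match, which would look like very poor performance.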
Also, you should change sequence = re.search("[AUCG]{2,}", outputStr).group()
to sequence = re.search("[ATUCG]{2,}", outputStr).group()
because apparently a small number of the input sequences also contain the nucleotide type T.
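The effect of the missing T can be seen in a short sketch. The sample output string below is made up for illustration and is not real mxfold2 output; `extract_sequence` is a hypothetical wrapper around the same `re.search` call.

```python
import re


def extract_sequence(output_str, pattern=r"[ATUCG]{2,}"):
    """Return the first run of two or more nucleotide letters in output_str."""
    match = re.search(pattern, output_str)
    if match is None:
        raise ValueError("no nucleotide sequence found in output")
    return match.group()


# Made-up example output: a header line, a sequence containing T, a structure.
sample = ">seq1\nGGATCC\n((..))"

print(extract_sequence(sample, pattern=r"[AUCG]{2,}"))   # GGA (cut off at the T)
print(extract_sequence(sample, pattern=r"[ATUCG]{2,}"))  # GGATCC
```

With the narrower character class the match stops at the first T, so the extracted sequence is shorter than the predicted structure and the two no longer line up.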
I have evaluated mxfold2 on the PDB dataset, but the performance has been abysmal, especially in comparison with SPOT-RNA. I was wondering whether something is wrong with my code. Here is the function I have been using to call mxfold2 and parse its output. Note that it prints in dot-bracket format.