Closed J-Calvelo closed 8 months ago
Hi,
SignalP 6 is essentially a model that predicts two things at the same time: (a) the type of the signal peptide, and (b) the region structure and the cleavage site, i.e. the label at each sequence position.
Region structures are different for each type. Sometimes the model predicts one type but a region structure that belongs to a different type (e.g. it predicts a Sec/SPI signal peptide, but a region structure of type "No SP" in the eukarya case). The model is a neural network, so it is hard to tell exactly when and why this happens. In most cases our prediction post-processing handles it, but some cases still get missed. We are still working on a fix that completely prevents it from happening.
For now, the tool gives you the warning (previous versions just crashed).
From a user perspective, you can always trust the type predictions; the warning does not matter there. If you also want to use the region structure and cleavage sites, I recommend you manually look at the probability curves of the affected sequences to see if they look ok.
Hope this helps for now until we have a fix.
The numbers refer to the original file, yes. But they are in Python format, meaning the first sequence in the file is 0, the 2nd is 1, and so on. It can be confusing; I guess I should also change that in the next update.
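In case it is useful, here is a minimal sketch of how the 0-based numbers from the warning could be mapped back to FASTA headers. It is not part of SignalP; the function names and the example records are made up for illustration, and it assumes every record starts with a ">" header line.

```python
from typing import Iterable, List, Tuple


def read_fasta_headers(lines: Iterable[str]) -> List[str]:
    """Return the header of every record, in file order, without the '>'."""
    return [line[1:].strip() for line in lines if line.startswith(">")]


def lookup_warned_sequences(
    lines: Iterable[str], indices: Iterable[int]
) -> List[Tuple[int, int, str]]:
    """Map 0-based indices to (0-based index, 1-based position, header)."""
    headers = read_fasta_headers(lines)
    return [(i, i + 1, headers[i]) for i in indices]


# Hypothetical input; with a real file, pass open("input.fasta") instead.
example = """>seqA
MKT
>seqB
MAA
>seqC
MLL
""".splitlines()

for zero_based, one_based, header in lookup_warned_sequences(example, [0, 2]):
    print(f"index {zero_based} (sequence number {one_based} in the file): {header}")
```

So an index of 0 in the warning points at the first record in the file, and an index of 2 at the third.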
It does, thanks!
Hello, I ran signalp6 with "--organism euk" and got these warnings:
Are these sequence numbers from the original file? If so, these are the sequences:
What is the cause? Thanks