Gaius-Augustus / Augustus

Genome annotation with AUGUSTUS
http://bioinf.uni-greifswald.de/webaugustus/

setting seed for randomSplit.pl #408

Closed fanhuan closed 8 months ago

fanhuan commented 8 months ago

Hi there,

I was wondering whether there is a way to set a seed for randomSplit.pl? I have a large training gene set and would like to do multiple runs, each with the same number of training genes (1000), to see whether the choice of genes makes a difference. Or is this a silly idea and I should just use 80% of my set? Any advice would be appreciated.

Best, Huan

MarioStanke commented 8 months ago

The current behavior is that each time you run something like

randomSplit.pl train.gb 100

you get the same (pseudorandom) partition into two GenBank files, because the script calls srand 4 at the very beginning, which fixes the seed. If you comment out or remove that line, you get a different split each time you run randomSplit.pl. This could indeed be useful for cross-validation purposes.

Best, Mario
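
For readers who want to see the seeding behavior described above in isolation, here is a minimal sketch in Python. It is not the randomSplit.pl code (which is Perl) and does not read GenBank files. The function name random_split, its parameters, and the placeholder gene IDs are all hypothetical and chosen only for illustration. It shows that a fixed seed reproduces the same partition on every run, while varying the seed gives different but still reproducible splits, which is one way to get repeatable cross-validation runs.

```python
import random


def random_split(records, test_size, seed=None):
    """Partition records into (test, train) via a pseudorandom shuffle.

    Hypothetical helper for illustration only.
    seed=4 plays the role of a fixed seed such as `srand 4`;
    seed=None lets Python pick a fresh seed, so each call differs.
    """
    rng = random.Random(seed)   # private generator, global state untouched
    shuffled = list(records)    # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    return shuffled[:test_size], shuffled[test_size:]


# Placeholder IDs standing in for the records of a GenBank training file.
genes = [f"gene{i}" for i in range(1000)]

# Fixed seed: both calls produce the identical partition.
test_a, train_a = random_split(genes, 100, seed=4)
test_b, train_b = random_split(genes, 100, seed=4)

# One seed per replicate: different partitions, each reproducible later.
replicates = {seed: random_split(genes, 100, seed=seed) for seed in (1, 2, 3)}
```

Recording the seed used for each replicate makes any individual split recoverable later, which simply removing the seeding line does not.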