Closed fanhuan closed 8 months ago
The current behavior is that each time you run something like
randomSplit.pl train.gb 100
you get the same (pseudorandom) partition into two GenBank files.
At the very beginning of the script is an srand 4 command. If you comment that out or remove the line, you get a different split each time you run randomSplit.pl.
Indeed this could be useful for cross-validation purposes.
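To illustrate the general effect of a fixed seed, here is a minimal Python sketch. It is not the code of randomSplit.pl (which is a Perl script); the function random_split and the gene IDs are hypothetical and only show why a fixed seed reproduces the same partition while varying the seed gives different ones.

```python
import random

def random_split(records, test_size, seed=None):
    """Split records into (test, train) lists.

    A fixed seed gives a repeatable split; seed=None seeds from
    system entropy, so each call gives a different split.
    """
    rng = random.Random(seed)
    shuffled = list(records)      # copy so the input is left untouched
    rng.shuffle(shuffled)
    return shuffled[:test_size], shuffled[test_size:]

# Hypothetical placeholder IDs standing in for GenBank records
genes = [f"gene{i}" for i in range(10)]

# Same seed, same partition (analogous to leaving the srand line in place)
print(random_split(genes, 3, seed=4) == random_split(genes, 3, seed=4))  # True

# One seed per run gives reproducible but distinct splits,
# which is what repeated runs or cross-validation would need
splits = [random_split(genes, 3, seed=s) for s in range(1, 6)]
```

Recording the seed used for each run keeps every split reproducible later, which removing the seed entirely would not.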
Best, Mario
Hi there,
I was wondering whether there is a way to set the seed for randomSplit.pl? I have a large training gene set and would like to do multiple runs with the same number of training genes (1000) to see whether it makes a difference. Or is this a silly idea and I should just use 80% of my set? Any advice would be appreciated.
Best, Huan