Closed guillaumecharbonnier closed 5 years ago
The problem is that the RAM footprint depends on the features. Even with batches of the same size, the memory cost will differ when processing 'exons' versus 'start_codon'.
Making the batch size dynamic would be possible, since the function producing each batch runs independently, but it would require some refactoring.
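Roughly, a dynamic batch size could be derived from a RAM budget and a per-feature memory cost obtained by benchmarking. This is only a sketch of the idea; every function name, parameter, and figure below is hypothetical, not part of the actual codebase:

```python
def estimate_batch_size(max_ram_bytes, bytes_per_record, safety_factor=0.8):
    """Derive a batch size from a RAM budget.

    `bytes_per_record` would come from benchmarking each feature type
    ('exons', 'start_codon', ...). The safety factor leaves headroom
    for memory the estimate does not account for.
    """
    budget = int(max_ram_bytes * safety_factor)
    return max(1, budget // bytes_per_record)

# Example: a 2 GiB budget with a made-up cost of 4 KiB per record.
print(estimate_batch_size(2 * 1024**3, 4 * 1024))
```

Feature types with a higher measured per-record cost would then automatically get smaller batches, which is what a '--max-ram' style option would need under the hood.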
I feel we could replace these two technical arguments with a more user-friendly "--max-ram" argument, with some benchmarking to find the appropriate extrapolation function depending on the input files. Even better (?), the algorithm could: