The output from the default "detailed caption" prompt can be very long. Is there any way to control the output length without decreasing the accuracy?
I have tried a smaller value for max-len-b, but this only truncates the result.
I have also tried changing lenpen, either by passing it as an argument (--lenpen 0.1) or by setting cfg.generation.lenpen = 0.1 in generate_predictions(), but the output is the same as before.
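
For context, here is a minimal sketch of how a length penalty is commonly applied when ranking beam-search hypotheses. It assumes the score is the summed token log-probability divided by length ** lenpen; that formula, the helper names, and the toy log-probabilities are assumptions for illustration and are not taken from this code base.

```python
def length_normalized_score(token_logprobs, lenpen):
    """Summed log-probability divided by length ** lenpen (assumed formula)."""
    return sum(token_logprobs) / (len(token_logprobs) ** lenpen)


def preferred(lenpen):
    """Return which of two toy hypotheses ranks higher for a given lenpen."""
    short = [-0.5, -0.5]      # 2 tokens, total log-prob -1.0 (made-up values)
    long = [-0.3] * 5         # 5 tokens, total log-prob -1.5 (made-up values)
    s = length_normalized_score(short, lenpen)
    l = length_normalized_score(long, lenpen)
    return "short" if s > l else "long"


if __name__ == "__main__":
    for lenpen in (0.1, 1.0, 2.0):
        print(f"lenpen={lenpen}: prefers the {preferred(lenpen)} hypothesis")
```

Under this assumed scoring, the penalty only re-ranks finished candidates, so it cannot change anything when there is a single candidate to choose from. That may be related to why lenpen = 0.1 made no difference for me, but I have not confirmed it.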