rikhuijzer / ata

Ask the Terminal Anything (ATA): ChatGPT in the terminal
MIT License

about issue #25 max_tokens parameter, you did not get the idea #27

Closed marioseixas closed 1 year ago

marioseixas commented 1 year ago

Hi

I think you made a mistake in your last commit.

Setting max_tokens to 4096 will always produce an error like this:

[screenshot of the error]

I asked you to remove every mention of max_tokens or token parameters from the code so that it defaults to Inf (4096 - prompt tokens). Inf is not infinite; Inf equals 4096 - prompt tokens, so nobody needs to keep editing ata.toml to raise or lower the max_tokens limit. It will do the calculation automatically:

[screenshot of the explanation]

The awesome dev Jack Wong explained this to me in an issue on his incredible app: https://github.com/jw-12138/davinci-web/issues/8

rikhuijzer commented 1 year ago

Allowing OpenAI to choose the maximum response length is like letting Shell decide how much fuel goes in my tank. I've set it to 2048 now as a reasonable default.
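For reference, the resulting configuration might look like the sketch below. Only the max_tokens field is confirmed by this thread; the other field names are assumptions, so check the repository's README for the exact format.

```toml
# ata.toml — sketch; field names other than max_tokens are assumed.
api_key = "<your-key>"
model = "gpt-3.5-turbo"
max_tokens = 2048
```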