doriansmiley / 10-kGPT

Analyze 10-Q and 10-K filings with GPT
https://github.com/doriansmiley/10-kGPT

API Limitations #2

Closed: mlau5689 closed 1 year ago

mlau5689 commented 1 year ago

Summary:

To improve result quality and work around usage limits and restrictions, one suggestion is to give contributors and users access to a shared-cost mediator service that handles interaction with the costly APIs. The limits on free API trials make some alternative approach necessary; another option is to use an open-source GPT model instead. Currently, acquiring premium keys for the two APIs below costs $75 per month, with most of the cost coming from EDGAR (SEC).

ChatGPT:

To ensure that only relevant information is analyzed, it is recommended to clean the dataset before prompting. The pricing options for ChatGPT are as follows:

Requests:

Thoroughly cleaning the dataset before prompting is advised to focus only on relevant information. The request limits for different pricing options are as follows:

Tokens:

To analyze the data effectively, divide it into sections and, where a section exceeds the per-request token limit, split that section into chunks. Limit each API response with the max_tokens parameter, then consolidate and summarize the chunked responses. Restrict each section summary to a share of the request limit, weighted by the section's size relative to the entire document. Consolidating the section summaries then gives a comprehensive analysis of the whole document. The token limits for different pricing options are as follows:
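The chunk-and-summarize scheme described above can be sketched roughly as below. This is only an illustration under stated assumptions: tokens are approximated by whitespace-separated words rather than a real tokenizer, and the function names (`count_tokens`, `chunk_section`, `summary_budgets`, `plan_requests`) are hypothetical, not part of this repo or of any API. The per-chunk budget it returns is the value one would pass as max_tokens on each request.

```python
def count_tokens(text: str) -> int:
    """Rough token estimate: whitespace-separated words (assumption, not a real tokenizer)."""
    return len(text.split())


def chunk_section(text: str, max_chunk_tokens: int) -> list[str]:
    """Split one section into chunks of at most max_chunk_tokens words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_chunk_tokens])
        for i in range(0, len(words), max_chunk_tokens)
    ]


def summary_budgets(sections: dict[str, str], total_summary_tokens: int) -> dict[str, int]:
    """Give each section a summary budget proportional to its share of the document."""
    sizes = {name: count_tokens(body) for name, body in sections.items()}
    total = sum(sizes.values())
    if total == 0:
        return {name: 0 for name in sections}
    return {name: total_summary_tokens * size // total for name, size in sizes.items()}


def plan_requests(
    sections: dict[str, str],
    max_chunk_tokens: int,
    total_summary_tokens: int,
) -> list[tuple[str, str, int]]:
    """Return (section name, chunk text, response token cap) for every request to send."""
    budgets = summary_budgets(sections, total_summary_tokens)
    plan = []
    for name, body in sections.items():
        chunks = chunk_section(body, max_chunk_tokens)
        if not chunks:
            continue
        # Spread the section's budget evenly over its chunks, at least 1 token each.
        per_chunk = max(1, budgets[name] // len(chunks))
        for chunk in chunks:
            plan.append((name, chunk, per_chunk))
    return plan
```

Each tuple in the plan maps to one API call; the responses for a section would then be concatenated and summarized once more to fit that section's budget.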

Context Loss and Image:

The more you chunk the data, the higher the risk of context loss. To limit context loss and keep the analysis accurate, clean the dataset thoroughly before prompting.
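What "cleaning" means is not spelled out here. One plausible reading, assuming the filings arrive as HTML, is to strip markup, scripts, and styles and collapse whitespace so that fewer tokens are spent on noise. A minimal standard-library sketch follows; the name `clean_filing_html` is hypothetical.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def clean_filing_html(raw_html: str) -> str:
    """Return the plain text of an HTML filing with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()
    # Join text fragments, then collapse any run of whitespace to one space.
    return re.sub(r"\s+", " ", " ".join(parser.parts)).strip()
```

Further filtering, such as dropping boilerplate sections or exhibits, would depend on which parts of the filing are considered relevant, so it is left out of this sketch.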

SEC:

To ensure only relevant information is analyzed, it is advisable to clean the dataset thoroughly. The pricing options for SEC are as follows:

mlau5689 commented 1 year ago

Not really an 'issue'. Just noting things to consider, especially given how crazy expensive EDGAR's API is.

Overall functionality is definitely doable with something like openai+langchain+pinecone.