jacobwilliams / json-fortran

A Modern Fortran JSON API
https://jacobwilliams.github.io/json-fortran/

Question: dealing with duplicate named objects #548

Closed gha3mi closed 3 months ago

gha3mi commented 7 months ago

Hi @jacobwilliams,

I'm currently using json-fortran in the ForOpenAI project. I've encountered a situation where I receive a single string in the following format that contains multiple objects with the same name, 'data':

data: {
    "content": "1",
}

data: {
    "content": "2",
}

data: {
    "content": "3",
}

data: [DONE]

Usually, when dealing with a single JSON object and without the object name 'data', json-fortran is easily usable:

{
    "content": "1"
}

call json%initialize()
call json%deserialize(string)
call json%get("content", content, found=found)
call json%destroy()

However, I am uncertain whether json-fortran supports extracting content from multiple 'data' objects in this format. Could you please provide guidance or suggestions on how to handle this scenario using json-fortran?

Thank you, Ali

borderite commented 3 months ago

Your data do not seem to be correctly coded. Try:

{
    "data": {
        "content": "1"
    },
    "data": {
        "content": "2"
    },
    "data": {
        "content": "3"
    },
    "data": ["DONE"]
}

json-fortran should be able to build its tree.
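As a rough illustration of reading such a tree, the sketch below parses the wrapped object above and walks the children of the root by index, since a lookup by name would presumably reach only one of the duplicated "data" members. It assumes a default (non-Unicode) build of json-fortran; the program name and the inline sample string are made up for the example.

```fortran
program duplicate_keys_sketch
    ! Hypothetical example: visit every member of the root object by index,
    ! so that members sharing the name "data" are all seen.
    use json_module, only: json_core, json_value
    implicit none

    type(json_core) :: core
    type(json_value), pointer :: root, child
    character(len=:), allocatable :: content
    logical :: found
    integer :: i, n

    call core%initialize()
    call core%deserialize(root, '{"data": {"content": "1"}, ' // &
                                '"data": {"content": "2"}, ' // &
                                '"data": {"content": "3"}, ' // &
                                '"data": ["DONE"]}')
    if (core%failed()) then
        call core%print_error_message()
        stop 1
    end if

    ! Number of members directly under the root object.
    call core%info(root, n_children=n)

    do i = 1, n
        call core%get_child(root, i, child)
        ! The last member is an array, so "content" is simply not found there.
        call core%get(child, 'content', content, found)
        if (found) print '(a)', content
    end do

    call core%destroy(root)
end program duplicate_keys_sketch
```

Walking by index sidesteps the question of which duplicate a name-based path would resolve to.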

gha3mi commented 3 months ago

Thanks for your reply.

> Your data do not seem to be correctly coded.

That is an example of a response from the OpenAI API.

borderite commented 3 months ago

While duplicated keys are not prohibited in JSON, you still need to quote keys and separate members with commas. I took a brief look at ForOpenAI's documentation, but I couldn't find a procedure that returns a JSON-formatted result. Are you sure that ForOpenAI is supposed to give you JSON data?

gha3mi commented 3 months ago

There is never enough time to improve documentation ;). The documentation is very basic and is generated automatically with FORD.

If you have an OpenAI API key and want to test it:

  1. In the foropenai.json file, under the ChatCompletion section, set stream to true. You can find this setting at: foropenai.json Line 17.

  2. To check the response, add the line print*, response%content after the existing code at: foropenai_ChatCompletion.f90 Line 877.

  3. Execute fpm run, and then start a chat with GPT from the terminal, for example "Hello. How are you?". You will then get a response in the format described here.

The problem is that I can't modify the response from OpenAI. I have to deserialize it somehow. At this point I don't even know whether a response in the above format is valid JSON or whether json-fortran is simply unable to read it.

borderite commented 3 months ago

Sorry, I misunderstood your situation. I was not aware that you were the author of ForOpenAI. While I don't have time to try your work, I ran a quick experiment with ChatGPT. I visited chatgpt.com and entered a question:

Can you list good asian grocery markets in manhattan island. Give me your answer in the JSON format.

ChatGPT's answer was

{
    "markets": [
        {
            "name": "H Mart",
            "location": "38 W 32nd St, New York, NY 10001",
            "specialty": "Wide variety of Korean groceries, including fresh produce, meats, and specialty items."
        },
        {
            "name": "Hong Kong Supermarket",
            "location": "157 Hester St, New York, NY 10013",
            "specialty": "Chinese supermarket offering a diverse range of fresh produce, seafood, and Asian groceries."
        },
        {
            "name": "Sunrise Mart",
            "location": "4 Stuyvesant St, New York, NY 10003",
            "specialty": "Japanese grocery store with a selection of fresh produce, snacks, and ingredients."
        },
        {
            "name": "M2M Asian Grocery",
            "location": "55 3rd Ave, New York, NY 10003",
            "specialty": "Small Japanese and Korean grocery store with a variety of snacks, sauces, and staples."
        },
        {
            "name": "Kalustyan's",
            "location": "123 Lexington Ave, New York, NY 10016",
            "specialty": "Specialty store offering a wide range of international foods, including Asian groceries, spices, and ingredients."
        }
    ]
}

This output is syntactically valid.

Also, I found the following subsection in the chat section of the ChatGPT documentation. The italicized parts might be of some use to you.

response_format object

Optional. An object specifying the format that the model must output. Compatible with GPT-4 Turbo and all GPT-3.5 Turbo models newer than gpt-3.5-turbo-1106.

Setting to { "type": "json_object" } enables JSON mode, which guarantees the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.

gha3mi commented 3 months ago

Thank you for testing! ForOpenAI works correctly for what you tested above.

As I mentioned earlier, the only issue is the one I described here, which occurs when stream is set to true. In that case, the response has the format I described in the first message.

borderite commented 3 months ago

Your foropenai.json has no "response_format" in the "ChatCompletion" section. I wonder whether that has something to do with the problem you described. (Well, I know very little about ChatGPT. :-) ) When I added "response_format": {"type": "json_object" } to the "ChatCompletion" section and ran foropenai, it aborted, saying that no API key was provided, even though I had provided one through the environment variable.

By the way, json-fortran is very good at detecting syntax errors in JSON data and giving informative error messages. It is next to impossible that it would deserialize malformed JSON data. You might want to resolve your issue with ChatGPT or its community.

Good luck!

jacobwilliams commented 3 months ago

So... I'm not sure what is being asked here.

The original question included text that was not valid JSON, so json-fortran is not going to be able to parse that. Not sure what else to say about that.
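One possible workaround for the format in the original question is to treat the text as a sequence of separate JSON documents rather than as one document: split it at each data: marker, stop at the [DONE] sentinel, and pass each remaining payload to json-fortran on its own. The sketch below is only an illustration under several assumptions: a default (non-Unicode) build of json-fortran, payloads that are themselves valid JSON (no trailing commas), and the string data: never occurring inside a payload. The program name, the strip helper, and the inline sample are hypothetical.

```fortran
program parse_stream_sketch
    ! Hypothetical example: split a streamed reply into its "data:" payloads
    ! and parse each payload as an independent JSON document.
    use json_module, only: json_file
    implicit none

    character(len=*), parameter :: nl = new_line('a')
    character(len=*), parameter :: marker = 'data:'
    character(len=:), allocatable :: stream, payload, content
    type(json_file) :: json
    logical :: found
    integer :: start, first, last, offset

    ! Sample input modeled on the question, with the trailing commas removed.
    stream = 'data: {"content": "1"}' // nl // nl // &
             'data: {"content": "2"}' // nl // nl // &
             'data: {"content": "3"}' // nl // nl // &
             'data: [DONE]' // nl

    start = index(stream, marker)
    do while (start > 0)
        first  = start + len(marker)               ! first character after the marker
        offset = index(stream(first:), marker)     ! next marker, relative to first
        if (offset > 0) then
            last  = first + offset - 2             ! payload ends before the next marker
            start = first + offset - 1
        else
            last  = len(stream)                    ! final payload runs to the end
            start = 0
        end if

        payload = strip(stream(first:last))
        if (payload == '[DONE]') exit              ! end-of-stream sentinel, not JSON

        call json%initialize()
        call json%deserialize(payload)
        if (json%failed()) then
            call json%print_error_message()
        else
            call json%get('content', content, found=found)
            if (found) print '(a)', content
        end if
        call json%destroy()
    end do

contains

    ! Remove leading and trailing blanks, tabs, and line breaks.
    pure function strip(s) result(t)
        character(len=*), intent(in) :: s
        character(len=:), allocatable :: t
        character(len=*), parameter :: ws = ' ' // achar(9) // achar(10) // achar(13)
        integer :: i, j
        i = verify(s, ws)
        j = verify(s, ws, back=.true.)
        if (i == 0) then
            t = ''
        else
            t = s(i:j)
        end if
    end function strip

end program parse_stream_sketch
```

Splitting on the marker rather than on line breaks keeps the sketch working when a payload is pretty-printed across several lines, as in the question. A real stream would need a stricter rule, for example accepting the marker only at the start of a line.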