koxudaxi / datamodel-code-generator

Pydantic model and dataclasses.dataclass generator for easy conversion of JSON, OpenAPI, JSON Schema, and YAML data sources.
https://koxudaxi.github.io/datamodel-code-generator/
MIT License
2.47k stars, 280 forks

Show more precise cause of "Invalid file format" exception #566

Open deepaerial opened 2 years ago

deepaerial commented 2 years ago

Is your feature request related to a problem? Please describe.

I have an RFC 8259-valid JSON file (plain JSON, not a JSON Schema) containing a list of objects. When I run datamodel-codegen --input list_of_leads.json --input-file-type json --output leads_model.py, I get an "Invalid file format" message. Interestingly, the script works if the list has only one object in it. I think it fails because of some object in the list, but because the list I'm feeding to datamodel-codegen has thousands of records, I can't determine which one causes the error.

Describe the solution you'd like

I would like datamodel-codegen to show a more detailed trace for the "Invalid file format" exception. The best solution would be to display the whole JSON object with a pointer to the offending column, or an error message explaining why the record can't be processed.

Describe alternatives you've considered

For example, the tool could at least display the column name or the index of the record that cannot be processed.
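Until the tool reports the offending record itself, one way to narrow it down is to bisect the list. The sketch below is only an illustration: find_failing_record and the fails callback are hypothetical names, and it assumes the failure is triggered by a single record rather than by a combination of records. In practice, fails would write the chunk to a temporary file and run datamodel-codegen on it.

```python
from typing import Any, Callable, List, Optional


def find_failing_record(
    records: List[Any],
    fails: Callable[[List[Any]], bool],
) -> Optional[int]:
    """Return the index of one record that makes `fails` return True, or None.

    Assumes (hypothetically) that a single record is responsible.
    """
    if not records or not fails(records):
        return None
    lo, hi = 0, len(records)  # invariant: the culprit is in records[lo:hi]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(records[lo:mid]):
            hi = mid  # culprit is in the first half
        else:
            lo = mid  # otherwise assume it is in the second half
    return lo
```

This needs roughly log2(n) runs of the check instead of one run per record, which matters when the list holds thousands of entries.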

koxudaxi commented 2 years ago

@deepaerial I'm sorry for my late reply. Thank you for creating this issue. It's a great idea, but it may be difficult to implement 🤔 I will think about how to do it.

ddanielgal commented 2 years ago

For people stumbling upon this issue in the search for what might have gone wrong (such as myself), in my case it was an encoding issue.

The json file I was trying to run through datamodel-codegen was the output of jq encoded as UTF-8, whereas the input to jq was us-ascii encoded. In my case, a simple -a option was enough for jq to keep the original encoding, and then the output went through datamodel-codegen without issue.

You can see a file's encoding with the file -i command.
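Where the file command is not available, a similar check can be sketched in Python using only the standard library. The function name describe_encoding is hypothetical, and the check only distinguishes ASCII, valid UTF-8, and everything else; it does not guess legacy encodings.

```python
def describe_encoding(data: bytes) -> str:
    """Classify raw bytes as ASCII, UTF-8, or neither."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError as exc:
        first_bad = exc.start  # offset of the first non-ASCII byte
    try:
        data.decode("utf-8")
        return f"utf-8 (first non-ASCII byte at offset {first_bad})"
    except UnicodeDecodeError as exc:
        return f"not valid UTF-8 (first undecodable byte at offset {exc.start})"
```

To check a file, pass its raw bytes, for example describe_encoding(pathlib.Path("list_of_leads.json").read_bytes()).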

wolever commented 1 year ago

I also ran into the Invalid file format issue, and it seems like it has something to do with using yaml as the input file format:

$ datamodel-codegen --output /tmp/x.py --input-file-type=yaml --input /tmp/x.yaml
Invalid file format

Converting the yaml to json seems to work, though:

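import json
import yaml  # PyYAML
# Load the YAML that datamodel-codegen rejected.
with open('/tmp/x.yaml') as f:
    json_obj = yaml.safe_load(f)
# Write ASCII-only JSON (json.dump escapes non-ASCII characters by default).
with open('/tmp/x.json', 'w', encoding='ascii') as f:
    json.dump(json_obj, f)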

And then:

$ datamodel-codegen --output /tmp/x.py --input /tmp/x.json
… works …

koxudaxi commented 1 year ago

@deepaerial @wolever I'm sorry for my late reply. OK, I will improve the message and the fallback behavior for when the code generator can't load the input file.

microspace commented 8 months ago

The default encoding for the input file is cp1251.
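If this means the file's bytes are in cp1251 while a different encoding is expected (or the reverse), re-encoding the input before running the tool may help. A minimal sketch, assuming the source encoding is known; transcode is a hypothetical helper name and the defaults are only illustrative:

```python
def transcode(data: bytes, source: str = "cp1251", target: str = "utf-8") -> bytes:
    """Decode bytes from the source encoding and re-encode them in the target."""
    return data.decode(source).encode(target)
```

Decoding with the wrong source encoding can succeed silently and produce garbled text, so it is worth checking the result by eye.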

dzmitry-lahoda commented 7 months ago

I got the same message with a different root cause. It seems my root value was an object rather than a list; I had to use jq to turn it into a list.
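The same transformation can be sketched in Python with the standard json module, assuming "jq into list" here means wrapping the root object in a one-element array. The function name ensure_list_root is hypothetical.

```python
import json


def ensure_list_root(text: str) -> str:
    """Return JSON text whose root is a list, wrapping a non-list root."""
    value = json.loads(text)
    if not isinstance(value, list):
        value = [value]  # wrap a single object (or scalar) in a list
    return json.dumps(value)
```

Input that already has a list at the root passes through unchanged apart from whitespace normalization.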

peterlynch commented 6 months ago

I got the same message. A perfectly valid OpenAPI 3 YAML file had this content:

      example:
        field:
          - change: =
            content: status
        operator:
          - change: =
            content: less_than
        value:
          - change: +
            content: solved

In order to get the model to generate, I needed to enclose the equal signs in quotes.

      example:
        field:
          - change: '='
            content: status
        operator:
          - change: '='
            content: less_than
        value:
          - change: +
            content: solved
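A likely explanation, assuming the file is parsed by a YAML 1.1 loader such as PyYAML: in YAML 1.1 a bare = resolves to the special "value" type rather than a string, and PyYAML's safe loader has no constructor for it, so loading fails. The sketch below is a line-based heuristic (not a YAML parser) that quotes such values; quote_bare_equals is a hypothetical helper name, and it only handles simple "key: =" block-style lines like the ones in this example.

```python
import re

# A mapping value (optionally inside a list item) that is exactly a bare "=",
# optionally followed by a trailing comment.
BARE_EQUALS = re.compile(
    r"^([ \t]*(?:-[ \t]+)?[^\s#:][^:\n]*:[ \t]+)=([ \t]*(?:#.*)?)$",
    re.MULTILINE,
)


def quote_bare_equals(text: str) -> str:
    """Wrap bare "=" mapping values in single quotes, leaving other lines as-is."""
    return BARE_EQUALS.sub(r"\1'='\2", text)
```

Values such as + or an already quoted '=' are left untouched, since they load as ordinary strings.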