profusion / sgqlc

Simple GraphQL Client
https://sgqlc.readthedocs.io/
ISC License
506 stars 85 forks source link

json loading: open file in binary mode to support BOM #190

Closed andreikop closed 2 years ago

andreikop commented 2 years ago

Otherwise fails with trace

✗ sgqlc-codegen schema schema.json              
Writing to: schema.py
Traceback (most recent call last):
  File "/usr/local/bin/sgqlc-codegen", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/site-packages/sgqlc/codegen/__init__.py", line 130, in main
    args.func(args)
  File "/usr/local/lib/python3.10/site-packages/sgqlc/codegen/schema.py", line 676, in handle_command
    schema = load_schema(in_file)
  File "/usr/local/lib/python3.10/site-packages/sgqlc/codegen/schema.py", line 620, in load_schema
    schema = json.load(in_file)
  File "/usr/lib64/python3.10/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/usr/lib64/python3.10/json/__init__.py", line 335, in loads
    raise JSONDecodeError("Unexpected UTF-8 BOM (decode using utf-8-sig)",
json.decoder.JSONDecodeError: Unexpected UTF-8 BOM (decode using utf-8-sig): line 1 column 1 (char 0)

when opening a UTF file which contains BOM

coveralls commented 2 years ago

Pull Request Test Coverage Report for Build 1863200937


Totals Coverage Status
Change from base Build 1859076545: 0.0%
Covered Lines: 1521
Relevant Lines: 1521

💛 - Coveralls
barbieri commented 2 years ago

@andreikop good call with the BOM, but I guess we need to convert it to string stating it's an UTF-8, otherwise parsers will break. At leasts using graphql-core==3.1.6 I get (weird enough the tests do not fail, it's reporting GitHub example Ok, I need to fix that one)

  File "/Users/gustavo/Development/git/sgqlc/.venv/lib/python3.7/site-packages/graphql/language/parser.py", line 103, in parse
    return parser.parse_document()
  File "/Users/gustavo/Development/git/sgqlc/.venv/lib/python3.7/site-packages/graphql/language/parser.py", line 199, in parse_document
    definitions=self.many(TokenKind.SOF, self.parse_definition, TokenKind.EOF),
  File "/Users/gustavo/Development/git/sgqlc/.venv/lib/python3.7/site-packages/graphql/language/parser.py", line 1088, in many
    self.expect_token(open_kind)
  File "/Users/gustavo/Development/git/sgqlc/.venv/lib/python3.7/site-packages/graphql/language/parser.py", line 982, in expect_token
    self._lexer.advance()
  File "/Users/gustavo/Development/git/sgqlc/.venv/lib/python3.7/site-packages/graphql/language/lexer.py", line 79, in advance
    token = self.token = self.lookahead()
  File "/Users/gustavo/Development/git/sgqlc/.venv/lib/python3.7/site-packages/graphql/language/lexer.py", line 88, in lookahead
    token.next = self.read_token(token)
  File "/Users/gustavo/Development/git/sgqlc/.venv/lib/python3.7/site-packages/graphql/language/lexer.py", line 109, in read_token
    if char in " \t,\ufeff":
TypeError: 'in <string>' requires string as left operand, not int
barbieri commented 2 years ago

pushed to master the fix to tests, now update-operations.sh will fail as expected. Please rebase your PR

barbieri commented 2 years ago

@andreikop seems the following patch fixes it. Could you check? If it really does, please include in your commit (this specific PR is best integrated as a single correct commit with all fixes incorporated, rebased to the latest master).

--- a/sgqlc/codegen/operation.py
+++ b/sgqlc/codegen/operation.py
@@ -937,7 +937,9 @@ def handle_command(parsed_args):
     in_files = args['operation.gql']
     short_names = args['short_names']

-    operations_gql = [Source(f.read(), f.name) for f in in_files]
+    operations_gql = [
+        Source(f.read().decode('utf-8'), f.name) for f in in_files
+    ]

     gen = CodeGen(
         schema, schema_name, operations_gql,
andreikop commented 2 years ago

@barbieri , I don't have time to test it now, but I've applied your fix because it looks perfectly reasonable. For json parser we open file in binary mode because it declares in the docs that binary utf-8 files are supported, when we need text - we explicitly decode it.

andreikop commented 2 years ago

Btw, we also can do open('r', encoding='utf-8'), whatever look better for you.

barbieri commented 2 years ago

if you can check with open('r', encoding='utf-8') with a BOM file and it works, it would be better as we know the rest of the software works as expected.

andreikop commented 2 years ago

@barbieri I've moved encoding to parameter declaration and tested this version.

barbieri commented 2 years ago

do we still need the rb? Couldn't we open all files using the encoding?

andreikop commented 2 years ago

@barbieri , the project crashes while trying to parse valid schema, and this PR discussion takes too much time. Just apply any changes which fix the crash.

barbieri commented 2 years ago

hi @andreikop , sorry to bother yet again. See if #193 works for you. I manually added the BOM to one of the schema.json, it failed before and not anymore after that patch.