profusion / sgqlc

Simple GraphQL Client
https://sgqlc.readthedocs.io/
ISC License
512 stars 85 forks source link

Mapping of JSON response on Operation isn't done completely #164

Open hofmatthias opened 3 years ago

hofmatthias commented 3 years ago

In the schema we use has quite a lot of nested objects

ex. query { devices { id name inputs { id type inputSettings { voltage temperature resolutionSettings { ... } } } } }

If we query all devices we get a valid json representation, thanks to the __to_graphql(depth=x) functionality. But when we map this json_data on the operation class we encountered the issue that the most nested data isn't mapped. data = op + json_data

When debugging and going over it I noticed that all the data is mapped correctly. When looking at some imports used I see that there is some form of multithreading used. My suspicion is that there is some race condition or something when there is a lot of nested data that needs to be loaded.

In above example the returned data would be something like this: [Device]

=> Without the resolutionSettings data in it, but this data is present in the json_data response

barbieri commented 3 years ago

sgqlc doesn't do any multithread on its own, or you mean on your side? But I bet that's not the issue.

I don't have time to dig into this now, but it would help me immensely if you can try to provide me an example (you can modify the existing examples, like the GitHub or Shopify).

But as an advice, the depth=x and auto-selection is only meant to help quick prototyping, it's not the GraphQL way of doing things and you should avoid that in real code. With GraphQL (and thus sgqlc), the recommendation is to only query the fields that you need. This becomes more clearly when you start to have API changes (like new APIs), but also may save lots of resources on servers (not all fields are "cheap", some may be computed or fetched from different tables, which results in extra load).

hofmatthias commented 3 years ago

Yeah, I know it's against the GraphQL best practices to request a lot at once. In our company it's an internal software tool that exposes a GraphQL API, and we need to use it for system testing. In this testing use case it useful to have all relevant data available from the beginning.

I'm going to try to solve it by separating the 1 query in multiple ones. But even when requesting 1 level less, I still have the same issue.

I'm not using multi threading. But if you follow the import chain of some packages you end up in the reprlib package which imports the _thread package. That's why I thought it could be a cause of the issue. Because I had similar issues in other projects with multi threading.

Attached you can find the query used video_sources.graphql.txt All data is mapped, except for the VideoSignal field.

I'm afraid that I don't have the time to investigate this issue on a different example.

Regarding attached query. I only get the "VideoSignal" field back when I request "VideoSourceSettings" from 1 specific video source. video_source_settings.txt

hofmatthias commented 3 years ago

I narrowed the issue down to the following: image In the add method of the Operation class.

o = self.type(other.get('data'), self.selection_list) => self.selection_list doesn't contain all selections If I print out self.__selection_list I get the following: selection_list.txt Not containing the VideoSignal field (that could also be because of the str or repr__ method because if I go into it with the debugger that field is present in the list)

hofmatthias commented 3 years ago

I found a very easy solution that fixed the problem. image

By instantiating the objects of the schema myself by passing the json data it fixes the problem. Thus not using the + operator anymore.

dbrumley commented 2 years ago

I think I ran into this issue as well when iterating over github forks. Example included.

The type of owners on repo.forks nodes changes after addition. Small example (this is line 143 and 146 of the code):

{'__typename': 'Repository', 'url': 'https://github.com/ivg/bap', 'owner': {'__typename': 'User', 'url': 'https://github.com/ivg', 'id': 'MDQ6VXNlcjIzMzY2OTg=', 'login': 'ivg', 'company': 'Carnegie Mellon University,  Cylab', 'email': 'ivg@ieee.org', 'twitterUsername': 'ivg_t'}}
> /Users/dbrumley/git/github/dbrumley/mayhem-heroes-cli/mayhem_heroes/graphql4.py(145)query_repos()
-> repo.forks += page_data.forks
(Pdb) n
> /Users/dbrumley/git/github/dbrumley/mayhem-heroes-cli/mayhem_heroes/graphql4.py(146)query_repos()
-> print(d['data']['repository']['forks']['nodes'][0])
(Pdb) 
{'url': 'https://github.com/ivg/bap', 'owner': {'id': 'MDQ6VXNlcjIzMzY2OTg=', 'login': 'ivg', 'url': 'https://github.com/ivg', 'email': 'ivg@ieee.org', 'company': 'Carnegie Mellon University,  Cylab', 'twitterUsername': 'ivg_t'}}

Interestingly, this does not happen with stargazers or watchers, which are also nodes.

One potential difference: the owner of a fork uses sum types, as it could be either a User or an Organization. stargazers and watchers are always users. Perhaps the bug has to do with sum types in queries? The type of repo.forks.__json_data__ shows that the type of the owner is changed from User to nothing for some reason.

(Pdb) p repo.forks.__json_data__
{'__typename': 'RepositoryConnection', 'totalCount': 246, 'pageInfo': {'hasNextPage': True, 'endCursor': 'Y3Vyc29yOnYyOpHOAY-mHQ=='}, 'nodes': [{'__typename': 'Repository', 'url': 'https://github.com/ivg/bap', 'owner': {'__typename': 'User', 'url': 'https://github.com/ivg', 'id': 'MDQ6VXNlcjIzMzY2OTg=', 'login': 'ivg', 'company': 'Carnegie Mellon University,  Cylab', 'email': 'ivg@ieee.org', 'twitterUsername': 'ivg_t'}}]}
(Pdb) n
> /Users/dbrumley/git/github/dbrumley/mayhem-heroes-cli/mayhem_heroes/graphql4.py(146)query_repos()
-> print(d['data']['repository']['forks']['nodes'][0])
(Pdb) p repo.forks.__json_data__
{'__typename': 'RepositoryConnection', 'totalCount': 246, 'pageInfo': {'hasNextPage': True, 'endCursor': 'Y3Vyc29yOnYyOpHOAZBysA=='}, 'nodes': [{'url': 'https://github.com/ivg/bap', 'owner': {'id': 'MDQ6VXNlcjIzMzY2OTg=', 'login': 'ivg', 'url': 'https://github.com/ivg', 'email': 'ivg@ieee.org', 'company': 'Carnegie Mellon University,  Cylab', 'twitterUsername': 'ivg_t'}}, {'url': 'https://github.com/maverickwoo/bap', 'owner': {'id': 'MDQ6VXNlcjE3MjY5Mw==', 'login': 'maverickwoo', 'url': 'https://github.com/maverickwoo', 'email': '', 'company': None, 'twitterUsername': None}}]}

graphql4.txt