datacamp / sqlwhat

https://sqlwhat.readthedocs.io
GNU Affero General Public License v3.0
3 stars 8 forks source link

reiterate package structure #124

Open filipsch opened 6 years ago

filipsch commented 6 years ago

protowhat contains the functions that are shared between sqlwhat and shellwhat (and maybe pythonwhat at some point, see issue).

Protowhat contains a bunch of functionality:

For these functions, protowhat makes total sense and should be kept. however, there is also AST-related functionality in protowhat, such as selectors, dispatchers, functions like check_field and check_node. These were pulled into protowhat becuase it was expected that both sqlwhat and shellwhat would use them. In reality however, to my knowledge there isn't a single shell exercise that has an SCT that uses the AST representation of a bash command.

Pulling the AST-related functionality back into sqlwhat would make the package easier to understand, easier to update and easier to test. Up for debate. cc @machow

machow commented 6 years ago

I could see either way. My sense in terms of keeping in protowhat...

pros:

cons:

I would lean toward keeping in protowhat, since we have 3 libraries doing AST checking: python, R (using parser data, but close enough), sql. In the future, it would be cool to distill what we've learned from these 3 AST approaches into one place.

For example, pythonwhat uses a visitor pattern, identical to protowhat, but with a twist that the visitors pull out conceptually useful info for SCTs (e.g. tries to infer what function was imported from from pandas import DataFrame as DF). This task is pretty similar to trying to figure out what the alias of a SQL query refers to, but I think it would be a huge headache for a person if pythonwhat and sqlwhat both implemented their own ways of doing this. It seems useful to have a clear sense for how protowhat could replace the pythonwhat specific code.

An example of protowhat being used with python is here: https://github.com/datacamp/protowhat/blob/master/example/example_python.ipynb

On the other hand, if AST based checks are going to become less important (e.g. they require too much development time), then I definitely could see moving the code into sqlwhat.

filipsch commented 6 years ago

@machow thanks for your comments. I will not do any of this in the near future, I just wanted to put this issue out there for whoever takes over the maintenance of these packages.