biocore / redbiom

Sample search by metadata and features
Other
44 stars 20 forks source link

Searching commands are incompatible with python 3.8+ #94

Closed ryanc16 closed 4 years ago

ryanc16 commented 4 years ago

Describe the Issue Performing any of the searching functions (and possibly others), results in a stack trace being output in the console. After minimal investigation, I believe this is due to changes in the ast (Abstract Syntax Tree) package included with python for parsing/compiling the users query terms done in the where_expr.py. It appears this is an issue with version of python 3.8 and newer, as I later tried installing an older version of python (3.7.7) and did not encounter the issue.

Steps to Reproduce Following the README found on the projects GitHub page, performing suggested commands such as:
redbiom search metadata "soil & europe where ph < 7"
results in the following error:

Traceback (most recent call last):
  File "/home/ubuntu-user/anaconda3/envs/python3.8/bin/redbiom", line 10, in <module>
    sys.exit(cli())
  File "/home/ubuntu-user/anaconda3/envs/python3.8/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/ubuntu-user/anaconda3/envs/python3.8/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/ubuntu-user/anaconda3/envs/python3.8/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ubuntu-user/anaconda3/envs/python3.8/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ubuntu-user/anaconda3/envs/python3.8/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ubuntu-user/anaconda3/envs/python3.8/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/ubuntu-user/anaconda3/envs/python3.8/lib/python3.8/site-packages/redbiom/commands/search.py", line 120, in search_metadata
    for i in redbiom.search.metadata_full(query, categories):
  File "/home/ubuntu-user/anaconda3/envs/python3.8/lib/python3.8/site-packages/redbiom/search.py", line 60, in metadata_full
    obs = set(redbiom.where_expr.whereeval(q, get=get).index)
  File "/home/ubuntu-user/anaconda3/envs/python3.8/lib/python3.8/site-packages/redbiom/where_expr.py", line 178, in whereeval
    result = eval(ast.dump(formed))
  File "<string>", line 1, in <module>
NameError: name 'Constant' is not defined

Expected Behavior

Versions

Additional Context This was performed on a fresh install of Ubuntu Desktop, which apparently now installs python 3.8 by default. After many attempts to get this installed and working with built in python, decided to try using anaconda to install and use on older versions of python. Notice in the code block in the steps to reproduce, running in a conda environment using python 3.8 also results in the error.

Screenshots N/A

wasade commented 4 years ago

That is interesting, thank you @ryanc16 for flagging and for the detailed report and investigation. This is really helpful. Would you be interested in contributing a pull request by chance?

From a quick review of the travis config, we're not currently testing against py38 but clearly it would be valuable to do so. Most users I suspect are still on py36 as that's where QIIME 2 currently is.

ryanc16 commented 4 years ago

I did some more investigation and came across Issue32892 on the python bug tracker site and this PR on GitHub. In short, the discussion is about the changes to the AST library to change Expressions created using ast.parse from using Num, Str, Bytes, Ellipsis, and NameConstant to using a single node Constant. ast.parse is used in the where_expr.py file. The conversation points out many open source projects in the wild which have been affected by this change.

Running it through a debugger I can confirm this is what is happening in this case as well. Using search metadata "soil where ph > 8", which is a similar but simplified command as mentioned in the original issue write-up, here is the results of the ast.dump(formed) on line 178 in where_expr.py:
python 3.7.7 python37 and python 3.8.3 python38

Notice the replacement of Num(n=8) with Constant(value=8, kind=None) which coincides with the linked issue and now makes sense in the context of the original error that stated:

NameError: name 'Constant' is not defined

Why that makes it not work, I have no idea. I would think it should know about the ast.Constant class. It doesn't actually fail until it attempts to use eval on the ast.dump result.

@wasade I am open to contributing, however my python experience is limited. I'm not really seeing anything about this, let alone an easy way to fix it, by googling or checking stack overflow. I can work with it some more to try to get something that works. It's possible there might be something that can be done with the node_types and ast.walk section of code.

At a minimum, it might be a good idea to update the README to mention this project suggests the use of python 3.5 or 3.6, given that it only tests against those versions. That would have been helpful for me when initially trying to get it installed and working.

wasade commented 4 years ago

Hi @ryanc16, sorry for the delayed response. This is excellent investigative work. I agree that it would make sense to limit the project to < 3.8 for the time being until a fix can be put in here. To be honest, this was my first time using the ast module in Python in well over 10 years, and the use is definitely rough around the corners.

ryanc16 commented 4 years ago

After spending some more time looking through the file, I realized what the problem actually was, based on the error message that was given. After seeing that Num and Str are redefined in that file (among other "valid" nodes), it became clear that the error wasn't talking about not being able to find ast.Constant, it was talking about it not being defined in the file when attempting to eval.