HobnobMancer / cazy_webscraper

Web scraper to retrieve protein data catalogued by the CAZy, UniProt, NCBI, GTDB and PDB websites/databases.
https://hobnobmancer.github.io/cazy_webscraper/
MIT License

[ERROR] [scraper.utilities.parse_configuration]: Could not open the CAZy synonym dictionary #36

Closed by NaiveLittleTiger 3 years ago

NaiveLittleTiger commented 3 years ago

When I run python cazy_webscraper.py --classes GTs, it returns the error '[scraper.utilities.parse_configuration]: Could not open the CAZy synonym dictionary'. I then went to the directory /cazy_webscraper/scraper/utilities/parse_configuration, but I am not sure whether this error means I should modify __init__.py or cazy_dictionary.json. Can someone help me?

HobnobMancer commented 3 years ago

First, note that GTs will raise an error anyway: it is not an accepted class, whereas GT or gt is.

The error comes from a FileNotFoundError raised at line 290 of the script scraper.utilities.parse_configuration, because the script can't find the file. The JSON file is used to load a dictionary of accepted synonyms of CAZy class names. Do you see the cazy_dictionary.json file in the /cazy_webscraper/scraper/utilities/parse_configuration dir?
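To illustrate the kind of step that fails here, this is a minimal sketch of loading a JSON dictionary and reporting the path that was tried when the file is missing. It is not the actual cazy_webscraper code: the function name load_synonym_dict is made up for illustration, and only the file name and the error text come from this thread.

```python
import json
from pathlib import Path


def load_synonym_dict(dict_path):
    """Load a JSON dictionary of class-name synonyms from dict_path.

    Hypothetical helper for illustration; not part of cazy_webscraper.
    """
    dict_path = Path(dict_path)
    try:
        with open(dict_path, "r") as fh:
            return json.load(fh)
    except FileNotFoundError as err:
        # Report the full path that was tried, so a badly built path
        # is easy to spot in the error message.
        raise FileNotFoundError(
            f"Could not open the CAZy synonym dictionary at: {dict_path.resolve()}"
        ) from err
```

If the path printed in the error does not match where cazy_dictionary.json actually sits on disk, the problem is in how the path was built rather than in the JSON file itself.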

NaiveLittleTiger commented 3 years ago

Thank you very much. I do have the cazy_dictionary.json file. [image] The error looks like this picture, and I can't find the script scraper.utilities.parse_configuration. [image]

HobnobMancer commented 3 years ago

If you update your local installation of cazy_webscraper, and it produces the same error message, the error message should now include the path cazy_webscraper was using to try and find the cazy_dictionary.json file. It looks like something is going wrong in the creation of that path.

Also, what do you mean when you say you can't find the script scraper.utilities.parse_configuration? scraper.utilities.parse_configuration is a module, and your directory explorer is pointing to it in your screenshot. The functions are contained in the __init__.py, so you do have the 'script' scraper.utilities.parse_configuration.

What OS are you using? That is the most likely cause of an issue with building the path to the cazy_dictionary.json file.
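For context on why the OS can matter, here is a minimal sketch of building the path to a data file that sits next to a module without hard-coding a path separator. It is an assumption about one OS-independent way to do this, not a description of how cazy_webscraper builds its path; build_dict_path is a hypothetical name.

```python
from pathlib import Path


def build_dict_path(module_file, filename="cazy_dictionary.json"):
    """Return the path to filename in the same directory as module_file.

    Hypothetical helper for illustration; not part of cazy_webscraper.
    """
    # pathlib joins with the separator of the current OS, so neither
    # "/" nor "\\" is hard-coded in the string.
    return Path(module_file).resolve().parent / filename
```

Inside a module this would typically be called as build_dict_path(__file__), so the result follows the installed location of the module rather than the directory the command was run from.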

NaiveLittleTiger commented 3 years ago

I am sorry, but I still get the same problem when I run python cazy_webscraper.py on Windows 10 and CentOS 7. On Linux: [image] However, cazy_dictionary.json does exist in the directory /cazy_webscraper/scraper/utilities/parse_configuration. I opened cazy_dictionary.json to check it and did not find any problem with it. [image] [image]

I also tried another method to install cazy_webscraper, conda install -c bioconda cazy_webscraper, but a PackageNotFoundError appeared, so I don't know what to do next.

HobnobMancer commented 3 years ago

We can't guarantee cazy_webscraper will run on any OS other than those listed under requirements in the README (MacOS and Linux).

Is the exact path in the error message the path to the file? It doesn't look correct to me.

cazy_webscraper is not yet integrated with bioconda, so that installation method is unlikely to have worked. Once version 1 is fully released, we aim to integrate it into bioconda.

When installing cazy_webscraper, did you use the -e flag with pip? Your command should look something like pip install -e <path_to_dir_containing_setup.py>. So if you are in the cazy_webscraper dir, the command would be pip install -e . (note the trailing dot). If you do not use the -e flag, pip will not unpack and install the tool as an executable from the dir in which it is located.
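One way to check which copy of a package Python is actually picking up after installing is to ask the import system where it would load the package from. The sketch below uses only the standard library; package_location is a hypothetical helper name, and the assumption that the top-level package is called scraper is taken from the module path mentioned earlier in this thread.

```python
import importlib.util
from pathlib import Path


def package_location(name):
    """Return the file a top-level module or package would be imported from.

    Returns None if the name cannot be found or has no file origin.
    Hypothetical helper for illustration; not part of cazy_webscraper.
    """
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin)
```

For example, package_location("scraper") pointing into your cloned cazy_webscraper dir would suggest the install from that dir is the one in use, while None would suggest the package is not installed in the current environment.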

NaiveLittleTiger commented 3 years ago

Thank you for your help and patience. I uninstalled cazy_webscraper, cloned it again with git clone, and then used pip install -e . Finally I can run the script cazy_webscraper.py. [image] This might need to be changed: it should not be python3 install -e .

HobnobMancer commented 3 years ago

No problem, and thanks for spotting the typo! The ReadTheDocs documentation is still in progress so I would recommend using the README as your primary source of info, then the ReadTheDocs doc as supplementary until the release of version 1.

For future reference, you didn't need to clone the repo again; just running the pip3 install command would have sufficed. Pip automatically overwrites the old install and installs the tool as a direct executable.