HobnobMancer / cazy_webscraper

Web scraper to retrieve protein data catalogued by the CAZy, UniProt, NCBI, GTDB and PDB websites/databases.
https://hobnobmancer.github.io/cazy_webscraper/
MIT License

'cazy_webscraper.py' is in 'scraper' directory #46

Closed LFLWilson closed 3 years ago

LFLWilson commented 3 years ago

Thanks for this great tool! I just downloaded it and found I had to move cazy_webscraper.py out of the scraper package folder and into the main directory for the script to run.

HobnobMancer commented 3 years ago

Hi! Thanks for using cazy_webscraper and glad you’re finding it useful!

That seems like weird behaviour, thanks for flagging it. How did you install cazy_webscraper? Also, what specific error message and traceback were you getting when trying to run cazy_webscraper, before you moved the script cazy_webscraper.py?

LFLWilson commented 3 years ago

Perhaps I haven't installed it correctly. I just downloaded everything, extracted it, and moved into the cazy-webscraper-master folder. I couldn't see cazy_webscraper.py there, but I found it in scraper, so I ran from the terminal:

python3 scraper/cazy_webscraper.py --families GT1

That gave me the following error:

Traceback (most recent call last):
  File "scraper/cazy_webscraper.py", line 76, in <module>
    from scraper import crawler
ModuleNotFoundError: No module named 'scraper'

LFLWilson commented 3 years ago

Ah, so sorry, I returned everything to its original state and ran pip3 install -e . and it works now! Oops. Sorry to bother. Thanks anyway!

HobnobMancer commented 3 years ago

Annoyingly, downloading the files doesn't install the package. There are a couple of extra steps to install cazy_webscraper, I'm afraid.

The README contains a brief summary of each of the three installation methods.

For a step-by-step walkthrough with more explanation, there is an installation tutorial in the documentation, which explains in depth how to get set up on MacOS, Windows and Linux systems.

For the quickest and easiest install, I would recommend using conda (if you have conda installed). You then need only run the command conda install -c bioconda cazy_webscraper and it handles everything. After that, to use cazy_webscraper from any directory, you need only call cazy_webscraper followed by any optional flags.

Alternatively, you can use pip (which you'll very likely have if you have Python installed) with the command pip3 install cazy_webscraper. Again, to use cazy_webscraper from any directory, you need only call cazy_webscraper followed by any optional flags.

The advantage of using pip and conda is that they will install cazy_webscraper and all the packages it requires to operate, and build the necessary paths between its modules and submodules.
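To illustrate why the import failed before installation: when Python runs a script by its path, it puts the script's own directory (here scraper/) at the front of sys.path, not the directory you launched it from, so from scraper import crawler cannot find the scraper package. Below is a minimal sketch that reproduces this in a throwaway directory. The file contents are hypothetical stand-ins, not the real cazy_webscraper code, and setting PYTHONPATH only mimics the effect of an install (making the root directory importable); it is not how pip or conda do it.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def build_demo_layout(root: Path) -> Path:
    """Create root/scraper/ with stand-in modules; return the script path."""
    pkg = root / "scraper"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "crawler.py").write_text("NAME = 'crawler'\n")
    script = pkg / "cazy_webscraper.py"
    # Same import style as the line shown in the traceback above.
    script.write_text(
        "from scraper import crawler\nprint('imported', crawler.NAME)\n"
    )
    return script


def run_python(args, cwd, extra_env=None):
    """Run the current interpreter with a clean import-path environment."""
    env = dict(os.environ)
    env.pop("PYTHONPATH", None)
    env.pop("PYTHONSAFEPATH", None)
    env.update(extra_env or {})
    return subprocess.run(
        [sys.executable, *args], cwd=cwd, env=env,
        capture_output=True, text=True,
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    script = build_demo_layout(root)
    rel_script = str(script.relative_to(root))

    # 1. Run by path: sys.path[0] is root/scraper, so the import fails.
    by_path = run_python([rel_script], cwd=root)
    # 2. Run as a module from the root: sys.path[0] is the root, so it works.
    as_module = run_python(["-m", "scraper.cazy_webscraper"], cwd=root)
    # 3. Run by path with the root made importable (roughly what installing achieves).
    with_root = run_python(
        [rel_script], cwd=root, extra_env={"PYTHONPATH": str(root)}
    )

print("by path:   ", by_path.stderr.strip().splitlines()[-1])
print("as module: ", as_module.stdout.strip())
print("with root: ", with_root.stdout.strip())
```

Only the first run fails, which matches the traceback reported earlier in this thread.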

If you want to continue on from where you have got to, you need to move the script cazy_webscraper.py back into the scraper directory. Then, with the terminal pointed at the main (root) directory, use setuptools to install cazy_webscraper. This can be done using the following command: pip3 install -e . (don't forget the -e flag, otherwise the tool won't be installed as an executable and you'll run into no end of problems!). Once that is done, you can call cazy_webscraper using python3 <path to cazy_webscraper.py> <optional flags>
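If you are unsure whether an install actually took effect, a quick check from Python can help. This is a rough sketch using only the standard library. The helper name install_status is made up for illustration, and it assumes both the distribution and the command are called cazy_webscraper, as in the pip and conda commands above.

```python
import shutil
from importlib import metadata


def install_status(dist_name: str, command: str) -> dict:
    """Report the installed version of a distribution and where its command lives.

    Both values are None when nothing is found.
    """
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    return {"version": version, "command_path": shutil.which(command)}


# Assumed names, taken from the install commands mentioned in this thread.
status = install_status("cazy_webscraper", "cazy_webscraper")
if status["version"] is None:
    print("cazy_webscraper does not appear to be installed in this environment")
else:
    print("installed version:", status["version"])
    print("command found at:", status["command_path"])
```

A version of None right after downloading and extracting the files would be expected, since nothing has been installed at that point.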

HobnobMancer commented 3 years ago

Oh no worries, that was bad timing on my part! Glad it's all working now :) It was also no bother at all!

If you have any other issues, be it bugs or being unsure which command flags/options to use, please raise an issue again.