HobnobMancer / cazy_webscraper

Web scraper to retrieve protein data catalogued by the CAZy, UniProt, NCBI, GTDB and PDB websites/databases.
https://hobnobmancer.github.io/cazy_webscraper/
MIT License

`--families` option does not work as stated in tutorial #39

Closed widdowquinn closed 3 years ago

widdowquinn commented 3 years ago

Describe the bug

The tutorial suggests that the following command should download a subset of CAZy families:

cazy_webscraper.py --families GH2,PL5,CE1,CE2,AA10

but it produces an error:

$ cazy_webscraper.py --families GH2,PL5,CE1,CE2,AA10
usage: cazy_webscraper.py [-h] [-c config file] [-d {None,class,family}] [-f] [-g Email address of user] [-l log file name] [-n] [-o output file name]
                          [-genbank_output output file name] [-pdb_output output file name] [-p {None,mmCif,pdb,xml,mmtf,bundle}] [-s] [-v]
cazy_webscraper.py: error: unrecognized arguments: --families GH2,PL5,CE1,CE2,AA10

To Reproduce

Expected behavior

CAZy family data is downloaded

HobnobMancer commented 3 years ago

I can't reproduce that error.

I tried a fresh install from conda (`conda install cazy_webscraper`), then ran `cazy_webscraper.py --families GH2,PL5,CE1,CE2,AA10`, and it scrapes only the families specified:

$ cazy_webscraper.py --families GH2,PL5,CE1,CE2,AA10
[WARNING] [scraper.utilities.parse_configuration]: Using default CAZy class synonyms
Parsing CAZy classes:   0%|                                                                                                                 | 0/4 [00:00<?, ?it/s]
Parsing Glycoside Hydrolases (GHs) families:   0%|                                                                                        | 0/172 [00:00<?, ?it/s]
Parsing protein pages for GH2:   1%|█                                                                                         | 293/25682 [00:35<30:48, 13.73it/s]

On your side it looks like the wrong parser is being called: the usage message matches the command-line argument parser for `expand.get_pdb_structures`, not the scraper's. I can't currently see why.
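To illustrate how calling the wrong parser would produce this error, here is a minimal sketch using plain `argparse`. Both builder functions and their options are hypothetical stand-ins, not the actual parsers in cazy_webscraper. The same argument list is accepted by a parser that defines `--families` and reported as unrecognized by one that does not.

```python
import argparse


def build_scraper_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for a parser that accepts --families."""
    parser = argparse.ArgumentParser(prog="cazy_webscraper.py")
    parser.add_argument("--families", default=None,
                        help="comma-separated CAZy families")
    return parser


def build_structures_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for a different parser with no --families option."""
    parser = argparse.ArgumentParser(prog="cazy_webscraper.py")
    parser.add_argument("-p", "--pdb", default=None,
                        help="structure file format")
    return parser


def unrecognized(parser: argparse.ArgumentParser, argv: list) -> list:
    """Return the arguments the parser does not know, without exiting."""
    _, leftover = parser.parse_known_args(argv)
    return leftover


argv = ["--families", "GH2,PL5,CE1,CE2,AA10"]
right = unrecognized(build_scraper_parser(), argv)
wrong = unrecognized(build_structures_parser(), argv)
print("scraper parser leftovers:", right)
print("structures parser leftovers:", wrong)
```

With `parse_args` instead of `parse_known_args`, the second parser would exit with an "unrecognized arguments" error like the one in the report.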

widdowquinn commented 3 years ago

Thanks for checking!

This might have stemmed from a failure to overwrite the .egg-info files (see #43) in my local development folder. It seems to work for me now that I've fixed it (see also #44).
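For anyone hitting something similar, a rough way to check for leftover metadata is to list the console scripts recorded in any `.egg-info` directories in the development folder. This is only a sketch: it assumes the setuptools layout, where each `.egg-info` directory holds an INI-style `entry_points.txt` with a `[console_scripts]` section, and the helper name is made up for illustration.

```python
import configparser
from pathlib import Path


def console_scripts_in_egg_info(root):
    """Map each *.egg-info directory under root to its recorded console scripts."""
    found = {}
    for egg_info in sorted(Path(root).rglob("*.egg-info")):
        entry_file = egg_info / "entry_points.txt"
        if not entry_file.is_file():
            continue
        # Split only on "=" so the "module:function" target stays intact,
        # and disable interpolation so "%" in a value cannot raise.
        parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
        parser.optionxform = str  # keep script names case-sensitive
        parser.read(entry_file, encoding="utf-8")
        if parser.has_section("console_scripts"):
            found[egg_info.name] = dict(parser.items("console_scripts"))
    return found
```

Calling `console_scripts_in_egg_info(".")` from the repository root returns a dictionary keyed by `.egg-info` directory name. An entry pointing a script at an unexpected module would be one sign that the metadata is out of date.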

I'm happy to close this.