lorey / mlscraper

🤖 Scrape data from HTML websites automatically by just providing examples
https://pypi.org/project/mlscraper/

missing mlscraper.html #29

Closed: appsec-airito closed this issue 2 years ago

appsec-airito commented 2 years ago

I followed the README and tested the code after running pip install --pre mlscraper

But I got a module-not-found error:

from mlscraper.html import Page
ModuleNotFoundError: No module named 'mlscraper.html'

When I checked the installed library, only the following files were present: ml.py, parser.py, training.py, util.py

For people checking out the library, it would be convenient if the README listed all the imports the example needs:

from mlscraper.html import Page
from mlscraper.samples import Sample, TrainingSet
from mlscraper.training import train_scraper

appsec-airito commented 2 years ago

Even after adding the modules, more errors keep popping up. I'm wondering if I'm missing something, so I'll wait before debugging further.

lorey commented 2 years ago

Hi. Thanks for getting in touch.

Not sure why, but this still sounds like an old version. The files exactly match version 0.1.2, as seen here: https://github.com/lorey/mlscraper/tree/68e59a948cd2e448bdd79240456ce447b1cf04db/mlscraper

If you want to be 100% sure, you can check the output of pip freeze, run print(mlscraper.__version__), or look in mlscraper/__init__.py for the __version__ variable.

I would assume pip installed the old version because a requirement of the newer releases isn't met, most likely Python 3.9+?
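The two checks suggested above (which mlscraper version is installed, and whether the interpreter is at least Python 3.9) can be sketched with the standard library alone. The helper names installed_version and python_is_at_least are made up for this illustration; importlib.metadata is available from Python 3.8 onward.

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def python_is_at_least(minimum):
    """Return True if the running interpreter is at least `minimum`, e.g. (3, 9)."""
    return sys.version_info[:2] >= tuple(minimum)


if __name__ == "__main__":
    # Hypothetical helpers for this thread's diagnosis, not part of mlscraper.
    print("mlscraper:", installed_version("mlscraper"))
    print("Python:", sys.version.split()[0])
    print("Python >= 3.9:", python_is_at_least((3, 9)))
```

If the first line prints 0.1.2 and the last prints False, that would be consistent with the old-version explanation above.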

lorey commented 2 years ago

I assume trying to force pip to install the latest version will most likely point you straight to the issue at hand:

pip install --pre 'mlscraper==1.0.0rc3'

appsec-airito commented 2 years ago

Yes, it's a version problem. I found it after I tried:

pip install git+https://github.com/lorey/mlscraper#egg=mlscraper

It said that Python >=3.9 is needed. I was trying with Python 3.8 since the repo said it is compatible only up to 3.8.

It works perfectly fine with Python 3.9.14.
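For anyone hitting the same thing: pip skips releases whose Requires-Python metadata excludes the running interpreter and installs the newest release that remains, which is why Python 3.8 silently got an old version. Below is a toy model of that selection, not pip's actual code. The version labels come from this thread, while the (3, 6) floor for 0.1.2 is a placeholder assumption, and newest_compatible is a made-up name.

```python
def newest_compatible(releases, python_version):
    """Pick the newest release whose minimum Python version is satisfied.

    `releases` is a list of (version_label, minimum_python) pairs ordered
    oldest to newest. Returns None if no release is compatible.
    """
    compatible = [
        label for label, minimum in releases if python_version >= minimum
    ]
    return compatible[-1] if compatible else None


# Labels are from this thread; the (3, 6) floor for 0.1.2 is an assumed
# placeholder, not taken from the package metadata.
RELEASES = [
    ("0.1.2", (3, 6)),
    ("1.0.0rc3", (3, 9)),
]

print(newest_compatible(RELEASES, (3, 8)))  # 0.1.2
print(newest_compatible(RELEASES, (3, 9)))  # 1.0.0rc3
```

Pinning the version explicitly, or installing from git as above, removes the silent fallback, so pip reports the Python requirement instead.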

lorey commented 2 years ago

Happy to hear that.

appsec-airito commented 2 years ago

Thanks for the quick reply.