lorey / mlscraper

🤖 Scrape data from HTML websites automatically by just providing examples
https://pypi.org/project/mlscraper/
1.31k stars 89 forks

Error during training #36

Closed liliwen365 closed 1 year ago

liliwen365 commented 1 year ago

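The page contains the value 5190. Is this error caused by spaces or blank lines in the data the page provides? How can this be solved? Can blank lines be ignored so that only the data is extracted?

My code:

    import requests
    from mlscraper.html import Page
    from mlscraper.samples import Sample, TrainingSet
    from mlscraper.training import train_scraper

    # fetch the page that contains the value to learn
    einstein_url = 'http://www.i001.com/main1.shtml'
    resp = requests.get(einstein_url)
    assert resp.status_code == 200

    # build a training set with one sample and train the scraper
    training_set = TrainingSet()
    page = Page(resp.content)
    sample = Sample(page, {'name': '5190'})
    training_set.add_sample(sample)
    scraper = train_scraper(training_set)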

Error returned: ValueError:

                                    5190
                                </td> is not in list
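
The error text suggests the table cell's text node carries the surrounding newlines and indentation, so it is not identical to the sample value '5190'. The following is a minimal sketch of that mismatch using only the Python standard library. It does not use mlscraper and is not a confirmed fix for this issue; the HTML snippet and the `TextCollector` and `collect_texts` names are hypothetical, made up for illustration.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: collects every text node exactly as it appears."""

    def __init__(self):
        super().__init__()
        self.texts = []

    def handle_data(self, data):
        # data includes any whitespace between the surrounding tags
        self.texts.append(data)


def collect_texts(html):
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return parser.texts


# Hypothetical snippet imitating a cell whose value is padded with whitespace.
SNIPPET = "<table><tr><td>\n        5190\n    </td></tr></table>"

raw_texts = collect_texts(SNIPPET)
# Strip each text node and drop the ones that are whitespace only.
stripped_texts = [t.strip() for t in raw_texts if t.strip()]

print("5190" in raw_texts)       # exact match against the raw text nodes
print("5190" in stripped_texts)  # match after stripping whitespace
```

If the page is under your control or can be preprocessed, normalizing whitespace before matching is one possible approach. Whether mlscraper itself does such normalization is not stated in this thread.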
lorey commented 1 year ago

My native language is German. Shall we meet in the middle and speak English?