scrapinghub / portia2code

BSD 3-Clause "New" or "Revised" License

TypeError: 'set' object does not support indexing #7

Open wenhel opened 7 years ago

wenhel commented 7 years ago

problem

This problem occurs when I use "Download as Scrapy" on the Scrapinghub website, and also when I run the command-line tool, e.g. portia_porter test01 test01x.

environment

The virtual machine I used was built from the official Portia Docker image, with portia2code (0.0.12) and slybot (0.13.1, /app/slybot).

the portia zip code

The zip file of a sample Portia project of mine is here; it was created in the Portia web UI via "Download as Portia".

tested websites

This error happens every time I build projects and spiders for websites such as Amazon and Home Depot, but not for Reddit.

the error message

The error message is as follows:

/app/slybot/slybot/plugins/scrapely_annotations/builder.py:366: ScrapyDeprecationWarning: Attribute `_root` is deprecated, use `root` instead
  elems = [elem._root for elem in self.selector.css(selector)]
2017-08-16 03:55:16 INFO: Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
2017-08-16 03:55:16 INFO: Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
2017-08-16 03:55:16 INFO: Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
2017-08-16 03:55:16 INFO: Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
2017-08-16 03:55:16 INFO: Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
2017-08-16 03:55:16 INFO: Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
2017-08-16 03:55:16 INFO: Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
2017-08-16 03:55:16 INFO: Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
2017-08-16 03:55:16 INFO: Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
2017-08-16 03:55:16 INFO: Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
2017-08-16 03:55:16 INFO: Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
2017-08-16 03:55:16 INFO: Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
2017-08-16 03:55:16 INFO: Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
2017-08-16 03:55:16 INFO: Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
2017-08-16 03:55:16 INFO: Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
2017-08-16 03:55:16 INFO: Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
2017-08-16 03:55:16 INFO: Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
2017-08-16 03:55:16 INFO: Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
2017-08-16 03:55:16 INFO: Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
2017-08-16 03:55:16 INFO: Creating spider "www.amazon.com"
Traceback (most recent call last):
  File "/usr/local/bin/portia_porter", line 39, in <module>
    project_zip = port_project(dir_name, schemas, spiders, extractors).read()
  File "/usr/local/lib/python2.7/dist-packages/portia2code/porter.py", line 273, in port_project
    spider_data = create_spiders(spiders, schemas, extractors, schema_names)
  File "/usr/local/lib/python2.7/dist-packages/portia2code/porter.py", line 239, in create_spiders
    spider = create_spider(name, spider, spec, schemas, extractors, items)
  File "/usr/local/lib/python2.7/dist-packages/portia2code/porter.py", line 221, in create_spider
    spider.plugins[0].extractors
  File "/usr/local/lib/python2.7/dist-packages/portia2code/samples.py", line 27, in extract
    items.extend(self.container(extractor, None) or [])
  File "/usr/local/lib/python2.7/dist-packages/portia2code/samples.py", line 39, in container
    return self.container(extractors[0], schema_id)
  File "/usr/local/lib/python2.7/dist-packages/portia2code/samples.py", line 51, in container
    self.items.get(schema_id, self.default_item))
  File "/usr/local/lib/python2.7/dist-packages/portia2code/utils.py", line 133, in container_to_item
    fields)
  File "/usr/local/lib/python2.7/dist-packages/portia2code/utils.py", line 153, in build_repeating_items
    elif sels and '::attr' not in sels[0] and '::text' not in sel[0]:
TypeError: 'set' object does not support indexing
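
For reference, the last frame of the traceback indexes `sels[0]` and `sel[0]`, and at least one of them is a set at that point. Sets are unordered and cannot be indexed, which is what raises the TypeError. Below is a minimal sketch of the failure and one possible guard. The helper names `first_selector` and `is_plain_selector` are hypothetical and are not part of portia2code; I have not checked how `build_repeating_items` actually builds these values, so this only illustrates the idea.

```python
def first_selector(sels):
    """Return one selector from a list, tuple, or set, or None if empty.

    Hypothetical helper, not part of portia2code.
    """
    if not sels:
        return None
    if isinstance(sels, (set, frozenset)):
        # Sets have no order, so sort to make the pick deterministic.
        return sorted(sels)[0]
    return sels[0]


def is_plain_selector(sels):
    """True when the chosen selector has no ::attr or ::text pseudo-element."""
    sel = first_selector(sels)
    return sel is not None and '::attr' not in sel and '::text' not in sel


# Indexing a set directly reproduces the TypeError from the traceback.
example = {'div.price'}
try:
    example[0]
    raised = False
except TypeError:
    raised = True
```

With a guard like this, a set of selectors is handled the same way as a list. Whether sorting is the right way to choose one selector from a set depends on what the original code intended.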
wenhel commented 7 years ago

I should mention that the problem also happens when I use the official online version of Portia. However, I cannot get the backend error message there. I just get a blank page that says "A server error occurred. Please contact the administrator."