ggaughan / pipe2py

A project to compile Yahoo! Pipes into Python (see it hosted on Google App Engine: http://pipes-engine.appspot.com)
http://wiki.github.com/ggaughan/pipe2py
GNU General Public License v2.0

union has no conf and therefore can't be iterated #9

Open gorenje opened 13 years ago

gorenje commented 13 years ago

Hi,

this is the error i get:

Traceback (most recent call last):
  File "testbasics.py", line 495, in test_fetchpage
    p = pipe2py.compile.parse_and_build_pipe(self.context, pipe_def)
  File "..../pipe2py/compile.py", line 298, in parse_and_build_pipe
    pb = build_pipe(context, pipe)
  File "..../pipe2py/compile.py", line 113, in build_pipe
    if 'prompt' in module['conf'] and context.describe_input:
TypeError: argument of type 'NoneType' is not iterable

after looking at the json for the pipe:

{u'type': u'union', u'id': u'sw-174', u'conf': None}

PipeId: MDbjmHcE3BGSFcldouNLYQ

Offending line: https://github.com/ggaughan/pipe2py/blob/master/compile.py#L110

This happens when creating a unit test, where the JSON is loaded and parsed. Creating a working pipe from the pipe ID is no problem.
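For reference, the failure can be reproduced outside pipe2py with just the module dict shown above. The sketch below is not the project's fix: `has_prompt` is a hypothetical helper that assumes a null `conf` can be treated as an empty dict.

```python
# Module definition as it appears in the pipe JSON (conf is null/None).
module = {'type': 'union', 'id': 'sw-174', 'conf': None}


def has_prompt(module):
    """Return True if the module's conf has a 'prompt' entry.

    Hypothetical helper, not part of pipe2py: a missing or null conf
    is treated as an empty dict so the membership test cannot fail.
    """
    conf = module.get('conf') or {}
    return 'prompt' in conf


# The unguarded check from the traceback raises TypeError on a None conf.
error = None
try:
    'prompt' in module['conf']
except TypeError as exc:
    error = str(exc)

# The guarded check simply reports that there is no prompt.
safe = has_prompt(module)
```

Whether skipping the check is the right behaviour for a module with no conf is a separate question; this only shows where the `TypeError` comes from.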

ggaughan commented 13 years ago

Strange - viewing that pipe in Yahoo! Pipes shows no union module (and only one fetchpage module but there are two in the json). I've seen something similar before and was able to get compile.py to ignore the phantom module because it was orphaned (had no connections) - this seems to be a bit more corrupt though. Saving a copy of the pipe in Yahoo! seems to fix the broken definition.

gorenje commented 13 years ago

Right at the bottom, just before the output --> http://pipes.yahoo.com/pipes/pipe.edit?_id=MDbjmHcE3BGSFcldouNLYQ

And it only happens when creating a unit test from the pipe. The complete json of the pipe:


prompt> cat pipelines/pipe_MDbjmHcE3BGSFcldouNLYQ.json 
{"modules": [{"type": "output", "id": "_OUTPUT", "conf": null}, {"type": "fetchpage", "id": "sw-117", "conf": {"URL": {"type": "url", "value": "http://railscasts.com/"}, "to": {"type": "text", "value": ""}, "token": {"type": "text", "value": "<div class=\"episode\">"}, "from": {"type": "text", "value": "<div class=\"episodes\">"}}}, {"type": "filter", "id": "sw-124", "conf": {"COMBINE": {"type": "text", "value": "and"}, "MODE": {"type": "text", "value": "block"}, "RULE": {"field": {"type": "text", "value": "content"}, "value": {"type": "text", "value": "class=\"episodes\""}, "op": {"type": "text", "value": "contains"}}}}, {"type": "rename", "id": "sw-138", "conf": {"RULE": [{"newval": {"type": "text", "value": "author"}, "field": {"type": "text", "value": "content"}, "op": {"type": "text", "value": "copy"}}, {"newval": {"type": "text", "value": "description"}, "field": {"type": "text", "value": "content"}, "op": {"type": "text", "value": "copy"}}, {"newval": {"type": "text", "value": "link"}, "field": {"type": "text", "value": "content"}, "op": {"type": "text", "value": "copy"}}]}}, {"type": "regex", "id": "sw-134", "conf": {"RULE": {"field": {"type": "text", "value": "link"}, "globalmatch": {"type": "text", "value": "1"}, "match": {"type": "text", "value": ".*\"(http://[^\\s]+)\".*"}, "replace": {"type": "text", "value": "$1"}}}}, {"type": "regex", "id": "sw-156", "conf": {"RULE": {"field": {"type": "text", "value": "link"}, "globalmatch": {"type": "text", "value": "1"}, "match": {"type": "text", "value": "^[ \\t].*$"}, "replace": {"type": "text", "value": ""}}}}, {"type": "fetchpage", "id": "sw-154", "conf": {"URL": {"type": "url", "value": "http://railscasts.com"}, "to": {"type": "text", "value": ""}, "token": {"type": "text", "value": ""}, "from": {"type": "text", "value": ""}}}, {"type": "union", "id": "sw-174", "conf": null}], "wires": [{"src": {"id": "_OUTPUT", "moduleid": "sw-117"}, "id": "_w0", "tgt": {"id": "_INPUT", "moduleid": "sw-124"}}, {"src": {"id": 
"_OUTPUT", "moduleid": "sw-124"}, "id": "_w1", "tgt": {"id": "_INPUT", "moduleid": "sw-138"}}, {"src": {"id": "_OUTPUT", "moduleid": "sw-138"}, "id": "_w2", "tgt": {"id": "_INPUT", "moduleid": "sw-134"}}, {"src": {"id": "_OUTPUT", "moduleid": "sw-134"}, "id": "_w3", "tgt": {"id": "_INPUT", "moduleid": "sw-156"}}, {"src": {"id": "_OUTPUT", "moduleid": "sw-156"}, "id": "_w8", "tgt": {"id": "_OTHER3", "moduleid": "sw-174"}}, {"src": {"id": "_OUTPUT", "moduleid": "sw-154"}, "id": "_w6", "tgt": {"id": "_OTHER", "moduleid": "sw-174"}}, {"src": {"id": "_OUTPUT", "moduleid": "sw-174"}, "id": "_w10", "tgt": {"id": "_INPUT", "moduleid": "_OUTPUT"}}], "layout": [{"xy": ["953", "540"], "id": "_OUTPUT"}, {"xy": ["189", "89"], "id": "sw-117"}, {"xy": ["682", "103"], "id": "sw-124"}, {"xy": ["168", "223"], "id": "sw-138"}, {"xy": ["600", "256"], "id": "sw-134"}, {"xy": ["649", "393"], "id": "sw-156"}, {"xy": ["172", "379"], "id": "sw-154"}, {"xy": ["635", "523"], "id": "sw-174"}]}
ggaughan commented 13 years ago

I can't see a union module. In Chrome and Firefox I only see: fetchpage, filter, rename, regex, regex, output. The JSON does have two fetchpages and a union. Saving a copy of the pipe gets rid of them from the JSON.

gorenje commented 13 years ago

strange - this is what i see http://www.flickr.com/photos/40500723@N04/5690440730/

union being right at the bottom ... that was done with FF 4.0.1

ggaughan commented 13 years ago

here's what I see (on a number of different machines/browsers): http://www.flickr.com/photos/62577683@N08/5692967396/

I'll try to get it to be ignored somehow (it won't be removed by the orphan handler because it's linked by a phantom wire to the output module).
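To illustrate why an orphan check alone would not catch this module, here is a rough sketch that lists modules with a null `conf` and whether any wire references them. It uses a trimmed excerpt of the JSON posted above (only three modules and two wires, with the fetchpage conf shortened); the function names are made up for this example and are not taken from compile.py.

```python
import json

# Trimmed excerpt of the pipe JSON from this ticket (not the full definition).
PIPE_JSON = '''
{"modules": [
   {"type": "output", "id": "_OUTPUT", "conf": null},
   {"type": "fetchpage", "id": "sw-154",
    "conf": {"URL": {"type": "url", "value": "http://railscasts.com"}}},
   {"type": "union", "id": "sw-174", "conf": null}],
 "wires": [
   {"src": {"id": "_OUTPUT", "moduleid": "sw-154"}, "id": "_w6",
    "tgt": {"id": "_OTHER", "moduleid": "sw-174"}},
   {"src": {"id": "_OUTPUT", "moduleid": "sw-174"}, "id": "_w10",
    "tgt": {"id": "_INPUT", "moduleid": "_OUTPUT"}}]}
'''


def wired_module_ids(pipe_def):
    """Collect the ids of all modules that appear at either end of a wire."""
    ids = set()
    for wire in pipe_def['wires']:
        ids.add(wire['src']['moduleid'])
        ids.add(wire['tgt']['moduleid'])
    return ids


def null_conf_modules(pipe_def):
    """Return (id, type, is_wired) for every module whose conf is None."""
    wired = wired_module_ids(pipe_def)
    return [(m['id'], m['type'], m['id'] in wired)
            for m in pipe_def['modules'] if m['conf'] is None]


pipe_def = json.loads(PIPE_JSON)
report = null_conf_modules(pipe_def)
for module_id, module_type, is_wired in report:
    print(module_id, module_type, 'wired' if is_wired else 'orphan')
```

In this excerpt both `_OUTPUT` and `sw-174` have a null conf and both are wired, so a check that only drops unconnected modules would leave the union in place.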

gorenje commented 13 years ago

that's weird, it looks like you're seeing a previously saved version of the pipe, but the pipe has been saved.

i guess you can use the json i posted in this ticket as a test case (the union operator is included in the json) and then you can try it via that. the json was generated from what i see, i.e. also a "legal" pipe ....

thanks for all the patience!

reubano commented 9 years ago

I just made a bunch of updates, can you check and see if the issue is still there?