tech-srl / code2seq

Code for the model presented in the paper: "code2seq: Generating Sequences from Structured Representations of Code"
http://code2seq.org
MIT License

Extraction API giving 403 Forbidden #89

Closed MattePalte closed 3 years ago

MattePalte commented 3 years ago

Hi, I loaded the pre-trained model and tried to run prediction in interactive mode. However, the Extraction API (https://po3g2dx2qa.execute-api.us-east-1.amazonaws.com/production/extractmethods) used in the code returns a 403 Forbidden error, as in issue #13 and issue #7.

Side note: I read the reasons for using an external endpoint, and they make sense. I had another idea: launch a Docker container that simulates the AWS Java Lambda function locally (https://docs.aws.amazon.com/lambda/latest/dg/java-image.html), so that the code can point to localhost and the extractor is always available, even offline. I am not an expert in AWS Java Lambda, so I am not sure this is possible. Just an idea.
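
On the Python side, the localhost idea might amount to making the extractor URL overridable. The following is a minimal sketch under stated assumptions: the environment variable name `CODE2SEQ_EXTRACTOR_URL` and the helper `resolve_extractor_url` are hypothetical and not part of the repository. The local URL is the invocation path documented for the AWS Lambda Runtime Interface Emulator; whether the Java extractor accepts the same payload when invoked that way, without the API Gateway layer in front of it, is untested here.

```python
import os

# Hosted endpoint quoted in this issue.
HOSTED_URL = "https://po3g2dx2qa.execute-api.us-east-1.amazonaws.com/production/extractmethods"

# Invocation path exposed by the AWS Lambda Runtime Interface Emulator when a
# Lambda container image runs locally with host port 9000 mapped to 8080
# (assumption: the extractor has been packaged as such an image).
LOCAL_EMULATOR_URL = "http://localhost:9000/2015-03-31/functions/function/invocations"

# Hypothetical variable name, used here only for illustration.
ENV_VAR = "CODE2SEQ_EXTRACTOR_URL"


def resolve_extractor_url(environ=None):
    """Return the URL override from the environment if set, else the hosted URL."""
    if environ is None:
        environ = os.environ
    override = environ.get(ENV_VAR, "").strip()
    return override or HOSTED_URL
```

With something like this in place, exporting the variable before starting interactive mode would redirect requests to the local container, and leaving it unset would keep the current behavior.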

Thanks in advance in any case

urialon commented 3 years ago

Hi @MattePalte , Thank you for reporting this!

I'll take care of it and update you once it's fixed.

Thanks, Uri

urialon commented 3 years ago

Hi again @MattePalte , The extraction API seems to be working OK for me, both directly and from the interactive mode. You can try running:

curl -X POST https://po3g2dx2qa.execute-api.us-east-1.amazonaws.com/production/extractmethods -d '{"code":"public boolean f() { return true; }", "decompose":"true"}'

and see that it returns code 200.

Did you make any modifications to the python code that might cause this?
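
The same check can also be run from Python without going through extractor.py or the requests package. This is a minimal sketch using only the standard library; the helper names `build_request` and `status_code` are illustrative and do not exist in the repository.

```python
import json
import urllib.error
import urllib.request

URL = "https://po3g2dx2qa.execute-api.us-east-1.amazonaws.com/production/extractmethods"


def build_request(code_string, url=URL):
    """Build a POST request carrying the same JSON body as the curl command."""
    body = json.dumps({"code": code_string, "decompose": "true"}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def status_code(request, timeout=30):
    """Send the request and return the HTTP status, including error statuses."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the code (e.g. 403) is the
        # value of interest here, so return it instead of propagating.
        return error.code
```

Calling `status_code(build_request("public boolean f() { return true; }"))` should report the same status that curl does. If the two disagree, the difference is more likely in the local Python environment than in the API.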

MattePalte commented 3 years ago

Hi @urialon, thanks for the fast reply.

I was using the default configuration. I investigated more carefully and solved the problem.

It crashed inside the requests module, called from extractor.py:

    @staticmethod
    def post_request(url, code_string):
        return requests.post(url, data=json.dumps({"code": code_string, "decompose": True}, separators=(',', ':')))

https://github.com/tech-srl/code2seq/blob/master/extractor.py#L18

This is the pdb output leading up to the segmentation fault:

> /home/paltenmo/conda_env/tf_3.6/lib/python3.6/site-packages/requests/sessions.py(541)request()
-> send_kwargs.update(settings)
(Pdb) prep.__dict__.keys()
dict_keys(['method', 'url', 'headers', '_cookies', 'body', 'hooks', '_body_position'])
(Pdb) [print(i) for i in prep.__dict__.items()]
('method', 'POST')
('url', 'https://po3g2dx2qa.execute-api.us-east-1.amazonaws.com/production/extractmethods')
('headers', {'User-Agent': 'python-requests/2.25.1', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Content-Length': '63'})
('_cookies', <RequestsCookieJar[]>)
('body', '{"code":"public boolean f() { return true; }","decompose":true}')
('hooks', {'response': []})
('_body_position', None)
[None, None, None, None, None, None, None]
(Pdb) n
> /home/paltenmo/conda_env/tf_3.6/lib/python3.6/site-packages/requests/sessions.py(542)request()
-> resp = self.send(prep, **send_kwargs)
(Pdb) n
Segmentation fault (core dumped)

With curl, however, I get the correct result:

(/home/paltenmo/conda_env/tf_3.6) paltenmo@donkey:~/projects/Code2Seq/code2seq$ curl -X POST https://po3g2dx2qa.execute-api.us-east-1.amazonaws.com/production/extractmethods -d '{"code":"public boolean f() { return true; }", "decompose":"true"}'

[{"target":"f","paths":[{"name1":"boolean","name2":"METHOD_NAME","shortPath":"Prim0|Mth|Nm1","path":"(PrimitiveType0)^(MethodDeclaration)_(NameExpr1)","name1NodeId":2,"name2NodeId":3,"name1TokenNum":0,"name2TokenNum":0},{"name1":"boolean","name2":"true","shortPath":"Prim0|Mth|Bk|Ret|BoolEx0","path":"(PrimitiveType0)^(MethodDeclaration)_(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)","name1NodeId":2,"name2NodeId":6,"name1TokenNum":0,"name2TokenNum":0},{"name1":"METHOD_NAME","name2":"true","shortPath":"Nm1|Mth|Bk|Ret|BoolEx0","path":"(NameExpr1)^(MethodDeclaration)_(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)","name1NodeId":3,"name2NodeId":6,"name1TokenNum":0,"name2TokenNum":0}],"ast":{"id":0,"children":[{"id":1,"type":"MethodDeclaration","children":[{"id":2,"name":"boolean","type":"PrimitiveType"},{"id":3,"name":"f","type":"NameExpr"},{"id":4,"type":"BlockStmt","children":[{"id":5,"type":"ReturnStmt","children":[{"id":6,"name":"true","type":"BooleanLiteralExpr"}]}]}]}]}}]
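
For a segmentation fault like the one in the pdb session, Python's standard `faulthandler` module can print the Python-level stack at the moment of the crash, which may help narrow down which call triggers it. This is a minimal sketch; the wrapper name `enable_crash_traceback` is illustrative only.

```python
import faulthandler


def enable_crash_traceback(stream=None):
    """Enable faulthandler so a fatal signal such as SIGSEGV dumps the Python stack."""
    if stream is None:
        # With no file argument, the traceback is written to sys.stderr.
        faulthandler.enable(all_threads=True)
    else:
        faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()
```

Calling this once before the request is sent, or starting the interpreter with `python -X faulthandler`, has the same effect. The dump shows Python frames only, so it points at the calling code rather than at the native library that actually crashed.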

It was a compatibility problem. I was using a conda environment with:

Thanks again for the support @urialon

urialon commented 3 years ago

Hi @MattePalte ,

Wow, this is really weird. Thanks for reporting this; I have updated the README as you suggested.

I'm glad that everything works now!

Best, Uri