Extraction API giving 403 Forbidden

MattePalte commented 3 years ago

Hi, I loaded the pre-trained model and am trying to predict in interactive mode. However, the Extraction API (https://po3g2dx2qa.execute-api.us-east-1.amazonaws.com/production/extractmethods) used in the code returns a 403 Forbidden error, as in issue #13 and issue #7.

Side note: I read the reasons behind having an external endpoint and they make sense. I had another idea: launch a Docker container that simulates the AWS Java Lambda function locally (https://docs.aws.amazon.com/lambda/latest/dg/java-image.html), so that the code can point to localhost and the extractor is always available, even offline. I am not an expert in AWS Java Lambda, so I am not sure it is possible. Just an idea.

Thanks in advance in any case

urialon commented 3 years ago

Hi @MattePalte , Thank you for reporting this!

I'll take care of it and update you once it's fixed.

Thanks, Uri

urialon commented 3 years ago

Hi again @MattePalte , The extraction API seems to be working OK for me, both directly and from the interactive mode. You can try running:

curl -X POST https://po3g2dx2qa.execute-api.us-east-1.amazonaws.com/production/extractmethods -d '{"code":"public boolean f() { return true; }", "decompose":"true"}'

and see that it returns code 200.

Did you make any modifications to the python code that might cause this?
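For anyone who prefers to run the same check from Python, here is a minimal sketch using only the standard library. The URL and payload are copied from the curl command above; the helper names `build_extraction_request` and `extraction_status` are hypothetical and not part of the code2seq code base.

```python
import json
import urllib.error
import urllib.request

# URL copied from the curl command in this thread.
EXTRACTION_URL = "https://po3g2dx2qa.execute-api.us-east-1.amazonaws.com/production/extractmethods"


def build_extraction_request(code_string, url=EXTRACTION_URL):
    """Build (but do not send) the same POST request as the curl command."""
    body = json.dumps({"code": code_string, "decompose": "true"}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def extraction_status(code_string, timeout=10):
    """Send the request and return the HTTP status code (e.g. 200 or 403)."""
    request = build_extraction_request(code_string)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises on 4xx/5xx responses; report the code instead.
        return error.code
```

Calling `extraction_status("public boolean f() { return true; }")` performs the network request, so the result depends on the endpoint being reachable from your machine.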

MattePalte commented 3 years ago

Hi @urialon, thanks for the fast reply.

I was using the default configuration. I investigated more carefully and solved the problem.

It crashed inside the requests module, called by extractor.py:

    @staticmethod
    def post_request(url, code_string):
        return requests.post(url, data=json.dumps({"code": code_string, "decompose": True}, separators=(',', ':')))

https://github.com/tech-srl/code2seq/blob/master/extractor.py#L18

This is the pdb output leading up to the segmentation fault:

> /home/paltenmo/conda_env/tf_3.6/lib/python3.6/site-packages/requests/sessions.py(541)request()
-> send_kwargs.update(settings)
(Pdb) prep.__dict__.keys()
dict_keys(['method', 'url', 'headers', '_cookies', 'body', 'hooks', '_body_position'])
(Pdb) [print(i) for i in prep.__dict__.items()]
('method', 'POST')
('url', 'https://po3g2dx2qa.execute-api.us-east-1.amazonaws.com/production/extractmethods')
('headers', {'User-Agent': 'python-requests/2.25.1', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Content-Length': '63'})
('_cookies', <RequestsCookieJar[]>)
('body', '{"code":"public boolean f() { return true; }","decompose":true}')
('hooks', {'response': []})
('_body_position', None)
[None, None, None, None, None, None, None]
(Pdb) n
> /home/paltenmo/conda_env/tf_3.6/lib/python3.6/site-packages/requests/sessions.py(542)request()
-> resp = self.send(prep, **send_kwargs)
(Pdb) n
Segmentation fault (core dumped)

If I use curl, however, I get the correct result:

(/home/paltenmo/conda_env/tf_3.6) paltenmo@donkey:~/projects/Code2Seq/code2seq$ curl -X POST https://po3g2dx2qa.execute-api.us-east-1.amazonaws.com/production/extractmethods -d '{"code":"public boolean f() { return true; }", "decompose":"true"}'

[{"target":"f","paths":[{"name1":"boolean","name2":"METHOD_NAME","shortPath":"Prim0|Mth|Nm1","path":"(PrimitiveType0)^(MethodDeclaration)_(NameExpr1)","name1NodeId":2,"name2NodeId":3,"name1TokenNum":0,"name2TokenNum":0},{"name1":"boolean","name2":"true","shortPath":"Prim0|Mth|Bk|Ret|BoolEx0","path":"(PrimitiveType0)^(MethodDeclaration)_(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)","name1NodeId":2,"name2NodeId":6,"name1TokenNum":0,"name2TokenNum":0},{"name1":"METHOD_NAME","name2":"true","shortPath":"Nm1|Mth|Bk|Ret|BoolEx0","path":"(NameExpr1)^(MethodDeclaration)_(BlockStmt)_(ReturnStmt)_(BooleanLiteralExpr0)","name1NodeId":3,"name2NodeId":6,"name1TokenNum":0,"name2TokenNum":0}],"ast":{"id":0,"children":[{"id":1,"type":"MethodDeclaration","children":[{"id":2,"name":"boolean","type":"PrimitiveType"},{"id":3,"name":"f","type":"NameExpr"},{"id":4,"type":"BlockStmt","children":[{"id":5,"type":"ReturnStmt","children":[{"id":6,"name":"true","type":"BooleanLiteralExpr"}]}]}]}]}}]
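Because the failure is a segmentation fault rather than a Python exception, pdb stops at the last Python line without showing what was on the stack when the process died. As a hedged sketch, the standard-library `faulthandler` module can dump the Python traceback on a fatal signal; the helper name `enable_crash_traces` and the log path are hypothetical.

```python
import faulthandler


def enable_crash_traces(log_path):
    """Dump Python tracebacks to log_path on fatal signals such as SIGSEGV."""
    log_file = open(log_path, "w")
    # all_threads=True also reports threads other than the one that crashed.
    faulthandler.enable(file=log_file, all_threads=True)
    # The caller must keep this file open: faulthandler writes to its descriptor.
    return log_file
```

Call this once at startup, before the code that crashes. Running the interpreter with `python -X faulthandler` enables the same handler and writes to stderr, with no code changes.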

It was a compatibility problem. I was using a conda environment with:

Python 3.6
TensorFlow 1.14
requests (I tried multiple versions/repositories, all with the same result)
rouge, installed with:
```
conda install git pip
pip install git+git://github.com/pltrdy/rouge@master
```
I solved the problem by changing the TensorFlow version to 1.12. Maybe this part of the README can be updated to avoid future problems: "TensorFlow 1.12 or newer (install)", removing "or newer".
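A small guard like the following could flag a mismatched environment before the crash happens. This is only a sketch based on this report, where 1.12 worked and 1.14 crashed; other versions are untested, and the function name `tf_version_matches` is hypothetical.

```python
def tf_version_matches(version, required=(1, 12)):
    """Return True if a version string such as '1.12.0' has the required major.minor."""
    parts = version.split(".")
    try:
        major_minor = (int(parts[0]), int(parts[1]))
    except (IndexError, ValueError):
        # Anything that does not parse as major.minor counts as a mismatch.
        return False
    return major_minor == tuple(required)
```

For example, after `import tensorflow as tf`, a call to `tf_version_matches(tf.__version__)` could be used to print a warning at startup.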

Thanks again for the support @urialon

urialon commented 3 years ago

Hi @MattePalte ,

Wow, this is really weird. Thanks for reporting this, I have updated the README as you suggested.

I'm glad that everything works now!

Best, Uri

tech-srl / code2seq

Extraction API giving 403 Forbidden #89