stanfordnlp / python-corenlp-protobuf

Python bindings for Stanford CoreNLP's protobufs.
MIT License

TypeError: Couldn't build proto file into descriptor pool! #5

Open XiaohanYa opened 2 years ago

XiaohanYa commented 2 years ago

Hi,

I'm using the CoreNLP client from stanza to annotate my text, and I want to save the results. I used the following code to save them. It seems to work, but I get an error when reading the result file back in.

with open(dir_path+'reproduced/corenlp/results.pb2', 'wb') as f:
    f.write(annotated_doc.SerializeToString())

To read the file, I tried the following code. I have corenlp-protobuf version 3.8.0 installed.

from corenlp_protobuf import Document, parseFromDelimitedString

def readCoreNLPProtoFile(protoFile):
    # Open the path passed in, not the literal string 'protoFile'
    with open(protoFile, 'rb') as f:
        buf = f.read()
    doc = Document()
    parseFromDelimitedString(doc, buf)
    return doc

However, I got the following TypeError. Could someone help me? Thanks in advance.

[image: Screen Shot 2021-11-09 at 10 01 08 PM]

AngledLuffa commented 2 years ago

Are you using two separate libraries that both happen to use protobuf to communicate with CoreNLP? There may be a conflict, especially since the protobuf definitions aren't synced.

I would recommend using stanza exclusively instead, since it is actively developed and generally has better support for CoreNLP.
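
Separately from the import conflict, the save and load code in the question may not match each other: the file is written with SerializeToString(), which emits the raw message bytes, while it is read with parseFromDelimitedString, which (as its name suggests, and assuming the usual protobuf convention) expects a varint length prefix in front of the message. The sketch below illustrates that framing difference in plain Python. It does not use either library; the helper names (encode_varint, decode_varint, write_delimited, read_delimited) and the sample payload are hypothetical stand-ins.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative int as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode a varint; return (value, position just after it)."""
    result = 0
    shift = 0
    while True:
        byte = buf[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7


def write_delimited(payload: bytes) -> bytes:
    """Frame a serialized message as <varint length><message bytes>."""
    return encode_varint(len(payload)) + payload


def read_delimited(buf: bytes, offset: int = 0) -> Tuple[bytes, int]:
    """Read one length-prefixed message; return (message bytes, next offset)."""
    size, pos = decode_varint(buf, offset)
    end = pos + size
    if end > len(buf):
        raise ValueError("length prefix points past the end of the buffer")
    return buf[pos:end], end


# Hypothetical stand-in for the bytes SerializeToString() would return.
payload = b"\x0a\x05hello"

# Delimited write paired with delimited read: round-trips cleanly.
framed = write_delimited(payload)
message, next_offset = read_delimited(framed)

# Raw write paired with delimited read: the first payload byte is
# misread as a length, so the read fails (or, in unlucky cases,
# silently returns the wrong slice).
try:
    read_delimited(payload)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

With the real message classes, the matching pairs would be SerializeToString() with the message's ParseFromString(), or a delimited writer with parseFromDelimitedString. Whichever pair is used, both sides should come from the same library so that only one set of protobuf definitions is loaded.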
