kermitt2 / grobid_client_python

Python client for GROBID Web services
Apache License 2.0
275 stars, 74 forks

How to add the `"segmentSentences": true` in the config file? #37

Closed: xegulon closed this issue 2 years ago

xegulon commented 2 years ago

Here is my config.json:

{
    "grobid_server": "localhost",
    "grobid_port": "8070",
    "batch_size": 1000,
    "sleep_time": 5,
    "timeout": 60,
    "coordinates": [ "persName", "figure", "ref", "biblStruct", "formula", "s" ],
    "segmentSentences": true
}

I added `"segmentSentences": true` at the end, but even with that, I don't see any sentence segmentation in the TEI output. How can I enable it?

kermitt2 commented 2 years ago

It's a command-line argument; see https://github.com/kermitt2/grobid_client_python#usage-and-options

xegulon commented 2 years ago

Thanks, but I don't use the command line; I use the Python client object from within my own Python code. How do I specify the sentence segmentation parameter that way?

kermitt2 commented 2 years ago

Then it's an argument of the client class's methods: see https://github.com/kermitt2/grobid_client_python/blob/master/grobid_client/grobid_client.py#L460 and https://github.com/kermitt2/grobid_client_python/blob/master/grobid_client/grobid_client.py#L107
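
If keeping the flag in config.json is still convenient, one option is to read it yourself and forward it as the `segment_sentences` keyword argument. This is only a minimal sketch: `CONFIG_TO_PROCESS_KWARGS`, `process_kwargs_from_config` and `load_process_kwargs` are hypothetical helpers, not part of grobid_client, and it assumes the client leaves unknown config keys alone (as the report above suggests).

```python
import json

# Hypothetical mapping from custom config.json keys to keyword arguments of
# GrobidClient.process(); per the report above, the client itself does not
# act on these keys.
CONFIG_TO_PROCESS_KWARGS = {
    "segmentSentences": "segment_sentences",
}


def process_kwargs_from_config(config):
    """Return the process() keyword arguments found in a config dict."""
    return {
        kwarg: config[key]
        for key, kwarg in CONFIG_TO_PROCESS_KWARGS.items()
        if key in config
    }


def load_process_kwargs(config_path):
    """Read a JSON config file and extract the extra process() arguments."""
    with open(config_path, encoding="utf-8") as handle:
        return process_kwargs_from_config(json.load(handle))


# Usage (needs grobid_client installed and a running GROBID server):
# from grobid_client.grobid_client import GrobidClient
# client = GrobidClient(config_path="config.json")
# client.process("processFulltextDocument", "path/to/pdfs",
#                **load_process_kwargs("config.json"))
```

Keys that are absent from the config are simply not forwarded, so the defaults of `process` still apply.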

xegulon commented 2 years ago

Thanks, so what I needed is:

from grobid_client.grobid_client import GrobidClient

client = GrobidClient(config_path="config.json")
client.process("processFulltextDocument", "path/to/pdfs", segment_sentences=True, n=20)
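
To check that segmentation actually took effect, one quick test is to count the `<s>` elements in a resulting TEI document. The snippet below is a rough sketch using only the standard library; `count_sentences` is a hypothetical helper and `sample` is a hand-written stand-in for real GROBID output, not an actual result.

```python
import xml.etree.ElementTree as ET

# TEI namespace used by the XML documents GROBID produces.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def count_sentences(tei_xml):
    """Count the <s> (sentence) elements in a TEI document.

    `tei_xml` is the document content as a str or bytes object.
    """
    root = ET.fromstring(tei_xml)
    return len(root.findall(".//tei:s", TEI_NS))


# Hand-written stand-in for a segmented TEI body (not real GROBID output).
sample = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><div>'
    "<p><s>First sentence.</s> <s>Second sentence.</s></p>"
    "</div></body></text></TEI>"
)

print(count_sentences(sample))  # prints 2
```

For a file on disk, read it in binary mode and pass the bytes to `count_sentences`. A count of 0 on a full-text result would suggest that `segment_sentences=True` was not passed.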