Open saisrinivas047 opened 6 years ago
As described in README, try this:
cat input.txt | python punctuator.py <model_path> <model_output_path>
Change input.txt to your input text file that will be punctuated. Change <model_path> to the location where you save the pre-trained .pcl model file. Then change <model_output_path> to the desired punctuated output file.
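The same pipe can also be driven from Python, which helps if `cat` is not available (for example on Windows). This is only a minimal sketch: the helper name `run_punctuator` and its `script` and `python` parameters are made up for illustration and are not part of the repo. It assumes punctuator.py reads the text from stdin and takes the model path and output path as its two arguments, exactly as in the README command above.

```python
import subprocess


def run_punctuator(input_path, model_path, output_path,
                   script="punctuator.py", python="python"):
    """Hypothetical helper: equivalent of
    `cat input_path | python punctuator.py model_path output_path`."""
    # Open the input file and hand it to the script as its stdin,
    # which is what the shell pipe does.
    with open(input_path, "rb") as source:
        subprocess.run(
            [python, script, model_path, output_path],
            stdin=source,
            check=True,  # raise CalledProcessError if the script fails
        )


# Example call (paths are placeholders, adjust to your setup):
# run_punctuator("input.txt", "<model_path>", "<model_output_path>")
```

Set `python` to whichever interpreter has the repo's dependencies installed.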
I am getting some gibberish output when I try to run the model on my input file. The script runs with some random output as you can see in the image. Can you help me run the model on my training data? The README doesn't help much for beginners.
Or can you give me a sample input for the
@acerock6 I got it working. I cloned the repo and created the folder ./data in my project root, and then downloaded the file Demo-Europarl-EN.pcl into my ./data folder.
I then created a sample file called test.txt and added some unpunctuated text. Then I ran:
cat test.txt | python punctuator.py ./data/Demo-Europarl-EN.pcl output.txt
It ran for about 5 minutes, but finally generated the file output.txt containing my sample text with punctuation symbols added to it. Note that this output isn't strictly readable: it just inserts strings like ".PERIOD" into your text to denote a period. To convert the punctuation symbols to normal punctuation, run:
python convert_to_readable.py output.txt output2.txt
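For anyone wondering what that conversion step amounts to, here is a rough sketch of the idea. It is not the repo's convert_to_readable.py, and the function name `to_readable` is made up. Only ".PERIOD" is confirmed in this thread; the sketch assumes every token is a punctuation character followed by an uppercase name (so ",COMMA" or "?QUESTIONMARK" would look the same way), and it adds a simple capitalization pass that the real script may or may not do.

```python
import re

# Assumed token shape: whitespace, one punctuation character, then an
# uppercase name, e.g. " .PERIOD". Only ".PERIOD" is confirmed by the thread.
TOKEN = re.compile(r"\s+([.,?!:;])[A-Z]+(?=\s|$)")


def to_readable(text):
    """Replace tokens like ' .PERIOD' with '.' attached to the previous word."""
    text = TOKEN.sub(r"\1", text)
    # Capitalize the first letter of the text and of each new sentence.
    return re.sub(
        r"(^|[.?!]\s+)([a-z])",
        lambda m: m.group(1) + m.group(2).upper(),
        text,
    )


print(to_readable("hello world .PERIOD how are you ?QUESTIONMARK"))
# prints: Hello world. How are you?
```

For real use, stick with the script shipped in the repo; this is just to show why the intermediate output.txt looks odd.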
Hi @ottokart, I am new to this. I downloaded the .pcl pre-trained model. Can someone tell me how to use this file to add punctuation to text?