FerdinandZhong / punctuator

A small seq2seq punctuator tool based on DistilBERT
Apache License 2.0

So many examples but no example for punctuate #11

Open FurkanGozukara opened 1 year ago

FurkanGozukara commented 1 year ago

Hello. I want to punctuate a big chunk of text, like the one below. How can I do that? Thank you.

Could you write some simple Python code to punctuate the text below?

The text is from my lecture video (https://www.youtube.com/watch?v=_nKwisL8dTs), for which I am trying to generate subtitles. Whisper does very well but fails to punctuate some parts.

okay sorry about this confusion what I did is when I have forgotten to unpause the video is simply I have coded a test button and the test button is using our original static file cmd and gif file cmd and I also fixed something in gif file cmd which is I have removed the loses command because it was giving an error now they are working I am using a wait for exists so let me show you how it works okay okay let me start test so the first process is started it is taking some time because that image is pretty big then it is starting the other one and now they are generated okay so you see original file is 820 kilobytes and let's see how much did we gain okay so 820 minus 572 over 820 you see 30 percent gain we have in this file it is significant and it has zero difference how can I be so sure about that we can be sure about that with a comparison okay so I am going to only make a single line of single pixel of difference here on this web p file and I will save it as a test on my desktop here as a png so I will name it as test to png okay and then I will save my original file as test png on the desktop here then I will use online comparison website let me show you compare image difference okay there are several pages for that so first try with diff checker diff checker is awesome website believe me okay so when I see check the difference there is a single line of difference here on this image so how they achieve this I wonder yeah so here when I hover and when I zoom in okay like this you see there is a single line a single pixel of difference here and no other differences it is exactly same and let's compare with another website okay online diff so first image and the second image so I will make the fuzziness zero and it will show as a red color okay so on this image there is a single pixel difference here which is what I have made and there is no other red dot okay so I can copy this image to zoom in so you see there is no other red dot because they are exactly same 
except the single line single pixel that I have made myself so basically we gain 35 percent 30 percent size in this image and on this gift image we gained from minus to 26.9 over this 35 percent you see with on the gift image we gain 35 percent and let's test if they are working or not so this is our WebP GIF and this is our iponic GIF this is original GIF file and this is WebP file they are looking pretty much same to me we can also use some online websites online GIF to WebP there is one website which I have found working very well this one or yeah let's try this I think it was this one so let's open our debug test so here our GIF upload it then you see there is losing compression mixed compression I unmark them and convert the WebP so this website generated a little bit higher kilobyte because probably it is not using the best compression and that's it okay so we are able to properly convert GIF and static PNG and probably GPX as well we haven't tested GPX so let's also test the GPX for example yeah this wallpaper it's pretty big so it will probably take a lot of time okay let's copy and paste this okay so I will remove this probably we don't even need it right now what is the file name it is this I am not sure if it if it can produce better than GPX because GPX is already losing compression as you know okay let's try it so all processes started at the same time because we are not waiting them and they are running right now as they get completed it will close the window and why it takes so long is that we are using the best possible algorithm and let's see the output okay so yes the WebP file is bigger than the original GPX it is because GPX is already losing and when I save this GPX as a PNG let's see the size okay size of the PNG is this we can of course optimize it a little bit more with PNG out win and I am pretty sure there will be still significant difference between PNG version and WebP version this is a software that I have purchased to optimize my PNG 
files previously but it is not anymore necessary because now we can use WebP format which is much better format okay so this software is single threaded on a single image so it is taking some time it has so many passes okay so the optimized PNG file is 2.53 megabytes and minus 1.52 megabytes over or not this one actually since GPX files are already losing we shouldn't convert them to WebP probably we we cannot we cannot achieve same quality I wonder if there is an losing but no point of converting GPX into WebP let me check that first okay okay same quality for GPX I think we need to have some losing compression probably for GPX compression we need to use some other methodology so let's see which which options we can use okay let's see okay so there is version loses near loses int so we can use near loses for GPX I think okay so which which option should we use I'm not sure I think I will try near loses yeah let's try it with so for that I'm going to have another file it will be for GPX for GPX I'm going to remove loses and change it with near loses with zero and I think I have to remove z9 as well so yeah I have to remove z9 okay let's try this way for GPX okay and this is the file name okay let's test GPX SR or a GPX and let's comment out this is and let's make it like this yeah okay let's see what kind of results we are going to get with GPX command okay so it is done oh wow now we have a better result than original GPX so let's compare two images quality of course I am not expecting them to be same yeah I can see the difference there is already some difference but I am not sure if we have lost some quality or not yeah we have lost some quality as you can see definitely and it is not small as well okay I wonder if it is possible to compress GPX losing quality is this even possible I'm not sure compress GPX okay okay

FerdinandZhong commented 1 year ago

Hi, you can refer to examples/english_inference_sample.py to see how to use the punctuator to add punctuation. The inference samples show exactly how to use the trained model for prediction. If you're not satisfied with the performance of the model provided in the repo, you can use the training example to retrain a model.

You just need to modify the code in examples/english_inference_sample.py to add punctuation to your own text. If you want to write a script that adds punctuation continuously, you can launch the punctuator and call its function. A sample is shown below:

import logging

from dbpunctuator.inference import Inference, InferenceArguments
from dbpunctuator.utils import ALL_PUNCS, DEFAULT_ENGLISH_TAG_PUNCTUATOR_MAP
from dbpunctuator.utils.utils import register_logger

logger = logging.getLogger(__name__)
register_logger(logger)

def produce_sample_text(text, repl=None):
    puncs = dict(zip(ALL_PUNCS, [repl] * len(ALL_PUNCS)))
    return text.lower().translate(puncs)

example = """
okay sorry about this confusion what I did is when I have forgotten to unpause the video is simply I have coded a test button and the test button is using our original static file cmd and gif file cmd and I also fixed something in gif file cmd which is I have removed the loses command because it was giving an error now they are working I am using a wait for exists so let me show you how it works okay okay let me start test so the first process is started it is taking some time because that image is pretty big then it is starting the other one and now they are generated okay so you see original file is 820 kilobytes and let's see how much did we gain okay so 820 minus 572 over 820 you see 30 percent gain we have in this file it is significant and it has zero difference how can I be so sure about that we can be sure about that with a comparison okay so I am going to only make a single line of single pixel of difference here on this web p file and I will save it as a test on my desktop here as a png so I will name it as test to png okay and then I will save my original file as test png on the desktop here then I will use online comparison website let me show you compare image difference okay there are several pages for that so first try with diff checker diff checker is awesome website believe me okay so when I see check the difference there is a single line of difference here on this image so how they achieve this I wonder yeah so here when I hover and when I zoom in okay like this you see there is a single line a single pixel of difference here and no other differences it is exactly same and let's compare with another website okay online diff so first image and the second image so I will make the fuzziness zero and it will show as a red color okay so on this image there is a single pixel difference here which is what I have made and there is no other red dot okay so I can copy this image to zoom in so you see there is no other red dot because they are exactly same 
except the single line single pixel that I have made myself so basically we gain 35 percent 30 percent size in this image and on this gift image we gained from minus to 26.9 over this 35 percent you see with on the gift image we gain 35 percent and let's test if they are working or not so this is our WebP GIF and this is our iponic GIF this is original GIF file and this is WebP file they are looking pretty much same to me we can also use some online websites online GIF to WebP there is one website which I have found working very well this one or yeah let's try this I think it was this one so let's open our debug test so here our GIF upload it then you see there is losing compression mixed compression I unmark them and convert the WebP so this website generated a little bit higher kilobyte because probably it is not using the best compression and that's it okay so we are able to properly convert GIF and static PNG and probably GPX as well we haven't tested GPX so let's also test the GPX for example yeah this wallpaper it's pretty big so it will probably take a lot of time okay let's copy and paste this okay so I will remove this probably we don't even need it right now what is the file name it is this I am not sure if it if it can produce better than GPX because GPX is already losing compression as you know okay let's try it so all processes started at the same time because we are not waiting them and they are running right now as they get completed it will close the window and why it takes so long is that we are using the best possible algorithm and let's see the output okay so yes the WebP file is bigger than the original GPX it is because GPX is already losing and when I save this GPX as a PNG let's see the size okay size of the PNG is this we can of course optimize it a little bit more with PNG out win and I am pretty sure there will be still significant difference between PNG version and WebP version this is a software that I have purchased to optimize my PNG 
files previously but it is not anymore necessary because now we can use WebP format which is much better format okay so this software is single threaded on a single image so it is taking some time it has so many passes okay so the optimized PNG file is 2.53 megabytes and minus 1.52 megabytes over or not this one actually since GPX files are already losing we shouldn't convert them to WebP probably we we cannot we cannot achieve same quality I wonder if there is an losing but no point of converting GPX into WebP let me check that first okay okay same quality for GPX I think we need to have some losing compression probably for GPX compression we need to use some other methodology so let's see which which options we can use okay let's see okay so there is version loses near loses int so we can use near loses for GPX I think okay so which which option should we use I'm not sure I think I will try near loses yeah let's try it with so for that I'm going to have another file it will be for GPX for GPX I'm going to remove loses and change it with near loses with zero and I think I have to remove z9 as well so yeah I have to remove z9 okay let's try this way for GPX okay and this is the file name okay let's test GPX SR or a GPX and let's comment out this is and let's make it like this yeah okay let's see what kind of results we are going to get with GPX command okay so it is done oh wow now we have a better result than original GPX so let's compare two images quality of course I am not expecting them to be same yeah I can see the difference there is already some difference but I am not sure if we have lost some quality or not yeah we have lost some quality as you can see definitely and it is not small as well okay I wonder if it is possible to compress GPX losing quality is this even possible I'm not sure compress GPX okay okay
"""

if __name__ == "__main__":
    args = InferenceArguments(
        model_name_or_path="Qishuai/distilbert_punctuator_en",
        tokenizer_name="Qishuai/distilbert_punctuator_en",
        tag2punctuator=DEFAULT_ENGLISH_TAG_PUNCTUATOR_MAP,
        gpu_device=1,
    )

    inference = Inference(inference_args=args, verbose=False)

    logger.info(f"testing result {inference.punctuation([example])[0]}")

    inference.terminate()

Output of the code above:

"Okay, sorry about this confusion. What I did is when I have forgotten to unpause. The video is simply I have coded a test button, and the test button is using our original static file cmd and gif file, cmd. And I also fixed something in gif file cmd, which is I have removed the loses command because it was giving an error. Now they are working. I am using a wait for exists. So let me show you how it works. Okay, okay, let me start test. So the first process is started it is taking some time because that image is pretty big. Then it is starting the other one, and now they are generated. Okay, so you see, original file is 820 kilobytes, and let's see how much did we gain. Okay, so 820 minus 572 over 820. You see 30 percent gain we have in this file. It is significant, and it has zero difference. How can I be so sure about that? We can be sure about that with a comparison. Okay, so I am going to only make a single line of single pixel of difference here on this web p file, and I will save it as a test on my desktop here as a png. So I will name it as test to png. Okay, and then I will save my original file as test png on the desktop here, then I will use online comparison website. Let me show you compare image difference. Okay, there are severalPages for that. So first try with diff checker. Diff checker is awesome website. Believe me. Okay, so when I see, check the difference, there is a single line of difference here on this image. So how they achieve this? I wonder. Yeah. So here, when I hover, and when I zoom in, okay, like this, you see, there is a single line, a single pixel of difference here and no other differences. It is exactly same. And let's compare with another website. Okay, online diff so first image and the second image, so I will make the fuzziness zero, and it will show as a red color. Okay, so on this image, there is a single pixel difference here, which is what I have made, and there is no other red dot. Okay, so I can copy this image to zoom in. 
So you see, there is no other red dot, because they are exactly same, except the single line, single pixel that I have made myself. So basically we gain 35 percent, 30 percent size in this image. And on this gift image, we gained from minus to 26.9 over this 35 percent you see with, on the gift image, we gain 35 percent, and let's test if they are working or not. So this is our WebP GIF. And this is our iponic GIF. This is original GIF file, and this is WebP file. They are looking pretty much same to me. We can also use some online websites online GIF to WebPThere is one website which I have found working very well. This one, or yeah, let's try this. I think it was this one. So let's open our debug test. So here, our GIF upload it, then you see, there is losing compression, mixed compression, I unmark them and convert the WebP. So this website generated a little bit higher kilobyte, because probably it is not using the best compression, and that's it. Okay, so we are able to properly convert GIF and static PNG, and probably GPX as well. We haven't tested GPX. So let's also test the GPX, for example. Yeah, this wallpaper, it's pretty big, so it will probably take a lot of time. Okay, let's copy and paste this. Okay, so I will remove this. Probably we don't even need it right now. What is the file name? It is this? I am not sure if it, if it can produce better than GPX, because GPX is already losing compression, as you know. Okay, let's try it. So all processes started at the same time, because we are not waiting them and they are running right now, as they get completed, it will close the window. And why it takes so long is that we are using the best possible algorithm, and let's see the output. Okay, so yes, the WebP file is bigger than the original GPX, it is because GPX is already losing. And when I save this GPX as a PNG, let's see the size, okay, size of thePng is this we can, of course optimize it a little bit more with PNG out win. 
And I am pretty sure there will be still significant difference between PNG version and WebP version. This is a software that I have purchased to optimize my PNG files previously. But it is not anymore necessary, because now we can use WebP format, which is much better format. Okay, so this software is single threaded on a single image, so it is taking some time. It has so many passes. Okay, so the optimized PNG file is 2.53 megabytes and minus 1.52 megabytes over or not this one. Actually, since GPX files are already losing, we shouldn't convert them to WebP, probably we we cannot. We cannot achieve same quality. I wonder if there is an losing but no point of converting GPX into WebP. Let me check that first. Okay, okay, same quality for GPX, I think we need to have some losing compression, probably for GPX compression, we need to use some other methodology. So let's see which which options we can use. Okay, let's see. Okay, so there is version loses near loses int so we can use, near loses. For GPX, I think okay, so which? Which option should we use? I'm not sure I think I will try near loses. Yeah, let's try it with so for that, I'm going to have another file it will be for GPX. For GPX, I'm going to remove, loses and change it with near loses withZero and I think I have to remove z9 as well so yeah I have to remove z9 okay let's try this way for GPX okay and this is the file name okay let's test GPX SR or a GPX and let's comment out this is and let's make it like this yeah okay let's see what kind of results we are going to get with GPX command. Okay so it is done. Oh wow now we have a better result than original GPX so let's compare two images quality of course I am not expecting them to be same. Yeah, I can see the difference there is already some difference, but I am not sure if we have lost some quality or not. Yeah, we have lost some quality, as you can see, definitely and it is not small as well. 
Okay, I wonder if it is possible to compress GPX losing quality. Is this even possible? I'm not sure compress GPX okay, okay,"
FurkanGozukara commented 1 year ago

@FerdinandZhong thank you very much for the example. Your output is pretty decent and would make my job much easier.

Do you have a pip package so that I can use your example code directly in Visual Studio? Or do you know how I can add your project to a Python virtual environment in Visual Studio on Windows 10?

I am a C# programmer, so I have very little knowledge of Python.

E.g. I want to use it like this:

image

FerdinandZhong commented 1 year ago

Hi, you can install the package with pip install distilbert-punctuator.

Then you can use it as shown in the example.

FerdinandZhong commented 1 year ago

Hi @FurkanGozukara, may I know if you have managed to use the punctuator? If yes, I will close the issue

FurkanGozukara commented 1 year ago

Thank you so much for the follow-up.

I get this error. Any ideas? I have an RTX 3060, and I run some other models with CUDA fine.

I copy-pasted your example code.

CUDA error: invalid device ordinal CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

image

image

The inference arguments seen while debugging are below:

InferenceArguments(model_name_or_path='Qishuai/distilbert_punctuator_en', tokenizer_name='Qishuai/distilbert_punctuator_en', tag2punctuator={'O': ('', False), 'COMMA': (',', False), 'PERIOD': ('.', True), 'QUESTIONMARK': ('?', True), 'EXLAMATIONMARK': ('!', True)}, tag2id_storage_path=None, gpu_device=1)

FurkanGozukara commented 1 year ago

Switching between cuda 0 and cuda 1 makes some difference, but there is still an error.

image

image

FerdinandZhong commented 1 year ago

Hi @FurkanGozukara, I have checked your issue and retried the example on my Linux machine, where it works.

Unfortunately, as mentioned in the official PyTorch documentation, multiprocessing with torch tensors is currently not supported on Windows, as shown in the image below.

image

I will go and check how to overcome this issue and release a newer version soon.

I will keep this issue open for now.

FurkanGozukara commented 1 year ago

@FerdinandZhong awesome, thank you very much.

I use my GPU with Whisper and it works great, as I show in the videos below:

How Good is RTX 3060 for ML AI Deep Learning Tasks and Comparison With GTX 1050 Ti and i7 10700F CPU https://www.youtube.com/watch?v=q8Q8CCDdSKo

How to do Free Speech-to-Text Transcription Better Than Google Premium API with OpenAI Whisper Model https://www.youtube.com/watch?v=msj3wuYf3d8

By the way, can I currently use it with the CPU instead of the GPU?

FerdinandZhong commented 1 year ago

Hi @FurkanGozukara glad to hear it works for your video.

Yes, you can also use the CPU instead; however, inference might be a little slower.

FurkanGozukara commented 1 year ago

@FerdinandZhong so how do we use the CPU instead of the GPU?

I tried several parameters, but all of them failed with a CUDA error.

FerdinandZhong commented 1 year ago

Simply run it on a machine without any GPU. The current version checks whether CUDA is available, as shown below:

if torch.cuda.is_available():
    self.device = torch.device(f"cuda:{inference_arguments.gpu_device}")
    logger.info(f"device type: {self.device.type}")
else:
    self.device = torch.device("cpu")

This behaviour could also be improved by adding CPU as an option in the inference arguments.
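A minimal sketch of what such an option could look like, written without importing torch so it runs anywhere. The function name resolve_device and the convention that gpu_device=None means "force CPU" are assumptions for illustration and are not part of dbpunctuator. The bounds check also shows why gpu_device=1 fails with "invalid device ordinal" on a machine with a single GPU: the only valid index there is 0.

```python
def resolve_device(gpu_device, cuda_available, device_count):
    """Return a device string such as "cpu" or "cuda:0".

    gpu_device:     requested GPU index, or None to force CPU
                    (hypothetical convention, not the library's API).
    cuda_available: whether CUDA can be used at all.
    device_count:   number of visible CUDA devices.
    """
    # Fall back to CPU when the caller asks for it or CUDA is missing.
    if gpu_device is None or not cuda_available:
        return "cpu"
    # A single-GPU machine only has index 0; anything else is an
    # invalid device ordinal, so fail early with a readable message.
    if not 0 <= gpu_device < device_count:
        raise ValueError(
            f"gpu_device={gpu_device} is out of range; "
            f"{device_count} CUDA device(s) visible"
        )
    return f"cuda:{gpu_device}"
```

With PyTorch installed, the two runtime values would come from torch.cuda.is_available() and torch.cuda.device_count(), and the returned string can be passed to torch.device.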

FurkanGozukara commented 1 year ago

@FerdinandZhong I removed that code and it works really fast on the CPU as well. But you should support a CPU argument too.

So the result is printed on the screen, but how can I save the result to a text file?
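One possible way to do this, as a sketch: it assumes the value logged as "testing result" (that is, inference.punctuation([example])[0] in the sample above) is a list of punctuated strings, which is what the log output suggests. The helper name save_punctuated and the output file name are made up for illustration.

```python
from pathlib import Path


def save_punctuated(texts, path):
    """Write each punctuated text on its own line, UTF-8 encoded.

    texts: list of strings (assumed shape of the punctuator's output).
    path:  destination file; it is created or overwritten.
    """
    path = Path(path)
    path.write_text("\n".join(texts) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    # Stand-in for the real result of the punctuator.
    result = ["Okay, sorry about this confusion."]
    print(save_punctuated(result, "punctuated.txt"))
```

In the sample script, the list passed in would be the result of the punctuation call, stored in a variable before it is logged.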


2022-10-21 11:36:29,524 - DEBUG - inference_pipeline.py:42 - inference_pipeline.wrapper - 54728 - After post_process, outputs is generated as "["Okay, sorry about this confusion. What I did is when I have forgotten to unpause. The video is simply I have coded a test button, and the test button is using our original static file cmd and gif file, cmd. And I also fixed something in gif file cmd, which is I have removed the loses command because it was giving an error. Now they are working. I am using a wait for exists. So let me show you how it works. Okay, okay, let me start test. So the first process is started it is taking some time because that image is pretty big. Then it is starting the other one, and now they are generated. Okay, so you see, original file is 820 kilobytes, and let's see how much did we gain. Okay, so 820 minus 572 over 820. You see 30 percent gain we have in this file. It is significant, and it has zero difference. How can I be so sure about that? We can be sure about that with a comparison. Okay, so I am going to only make a single line of single pixel of difference here on this web p file, and I will save it as a test on my desktop here as a png. So I will name it as test to png. Okay, and then I will save my original file as test png on the desktop here, then I will use online comparison website. Let me show you compare image difference. Okay, there are severalPages for that. So first try with diff checker. Diff checker is awesome website. Believe me. Okay, so when I see, check the difference, there is a single line of difference here on this image. So how they achieve this? I wonder. Yeah. So here, when I hover, and when I zoom in, okay, like this, you see, there is a single line, a single pixel of difference here and no other differences. It is exactly same. And let's compare with another website. Okay, online diff so first image and the second image, so I will make the fuzziness zero, and it will show as a red color. 
Okay, so on this image, there is a single pixel difference here, which is what I have made, and there is no other red dot. Okay, so I can copy this image to zoom in. So you see, there is no other red dot, because they are exactly same, except the single line, single pixel that I have made myself. So basically we gain 35 percent, 30 percent size in this image. And on this gift image, we gained from minus to 26.9 over this 35 percent you see with, on the gift image, we gain 35 percent, and let's test if they are working or not. So this is our WebP GIF. And this is our iponic GIF. This is original GIF file, and this is WebP file. They are looking pretty much same to me. We can also use some online websites online GIF to WebPThere is one website which I have found working very well. This one, or yeah, let's try this. I think it was this one. So let's open our debug test. So here, our GIF upload it, then you see, there is losing compression, mixed compression, I unmark them and convert the WebP. So this website generated a little bit higher kilobyte, because probably it is not using the best compression, and that's it. Okay, so we are able to properly convert GIF and static PNG, and probably GPX as well. We haven't tested GPX. So let's also test the GPX, for example. Yeah, this wallpaper, it's pretty big, so it will probably take a lot of time. Okay, let's copy and paste this. Okay, so I will remove this. Probably we don't even need it right now. What is the file name? It is this? I am not sure if it, if it can produce better than GPX, because GPX is already losing compression, as you know. Okay, let's try it. So all processes started at the same time, because we are not waiting them and they are running right now, as they get completed, it will close the window. And why it takes so long is that we are using the best possible algorithm, and let's see the output. Okay, so yes, the WebP file is bigger than the original GPX, it is because GPX is already losing. 
And when I save this GPX as a PNG, let's see the size, okay, size of thePng is this we can, of course optimize it a little bit more with PNG out win. And I am pretty sure there will be still significant difference between PNG version and WebP version. This is a software that I have purchased to optimize my PNG files previously. But it is not anymore necessary, because now we can use WebP format, which is much better format. Okay, so this software is single threaded on a single image, so it is taking some time. It has so many passes. Okay, so the optimized PNG file is 2.53 megabytes and minus 1.52 megabytes over or not this one. Actually, since GPX files are already losing, we shouldn't convert them to WebP, probably we we cannot. We cannot achieve same quality. I wonder if there is an losing but no point of converting GPX into WebP. Let me check that first. Okay, okay, same quality for GPX, I think we need to have some losing compression, probably for GPX compression, we need to use some other methodology. So let's see which which options we can use. Okay, let's see. Okay, so there is version loses near loses int so we can use, near loses. For GPX, I think okay, so which? Which option should we use? I'm not sure I think I will try near loses. Yeah, let's try it with so for that, I'm going to have another file it will be for GPX. For GPX, I'm going to remove, loses and change it with near loses withZero and I think I have to remove z9 as well so yeah I have to remove z9 okay let's try this way for GPX okay and this is the file name okay let's test GPX SR or a GPX and let's comment out this is and let's make it like this yeah okay let's see what kind of results we are going to get with GPX command. Okay so it is done. Oh wow now we have a better result than original GPX so let's compare two images quality of course I am not expecting them to be same. 
Yeah, I can see the difference there is already some difference, but I am not sure if we have lost some quality or not. Yeah, we have lost some quality, as you can see, definitely and it is not small as well. Okay, I wonder if it is possible to compress GPX losing quality. Is this even possible? I'm not sure compress GPX okay, okay,"]"
2022-10-21 11:36:29,525 - INFO - _03_distilbert_punctuator.py:40 - _03_distilbert_punctuator.<module> - 54236 - testing result ["Okay, sorry about this confusion. What I did is when I have forgotten to unpause. The video is simply I have coded a test button, and the test button is using our original static file cmd and gif file, cmd. And I also fixed something in gif file cmd, which is I have removed the loses command because it was giving an error. Now they are working. I am using a wait for exists. So let me show you how it works. Okay, okay, let me start test. So the first process is started it is taking some time because that image is pretty big. Then it is starting the other one, and now they are generated. Okay, so you see, original file is 820 kilobytes, and let's see how much did we gain. Okay, so 820 minus 572 over 820. You see 30 percent gain we have in this file. It is significant, and it has zero difference. How can I be so sure about that? We can be sure about that with a comparison. Okay, so I am going to only make a single line of single pixel of difference here on this web p file, and I will save it as a test on my desktop here as a png. So I will name it as test to png. Okay, and then I will save my original file as test png on the desktop here, then I will use online comparison website. Let me show you compare image difference. Okay, there are severalPages for that. So first try with diff checker. Diff checker is awesome website. Believe me. Okay, so when I see, check the difference, there is a single line of difference here on this image. So how they achieve this? I wonder. Yeah. So here, when I hover, and when I zoom in, okay, like this, you see, there is a single line, a single pixel of difference here and no other differences. It is exactly same. And let's compare with another website. Okay, online diff so first image and the second image, so I will make the fuzziness zero, and it will show as a red color. 
Okay, so on this image, there is a single pixel difference here, which is what I have made, and there is no other red dot. Okay, so I can copy this image to zoom in. So you see, there is no other red dot, because they are exactly same, except the single line, single pixel that I have made myself. So basically we gain 35 percent, 30 percent size in this image. And on this gift image, we gained from minus to 26.9 over this 35 percent you see with, on the gift image, we gain 35 percent, and let's test if they are working or not. So this is our WebP GIF. And this is our iponic GIF. This is original GIF file, and this is WebP file. They are looking pretty much same to me. We can also use some online websites online GIF to WebPThere is one website which I have found working very well. This one, or yeah, let's try this. I think it was this one. So let's open our debug test. So here, our GIF upload it, then you see, there is losing compression, mixed compression, I unmark them and convert the WebP. So this website generated a little bit higher kilobyte, because probably it is not using the best compression, and that's it. Okay, so we are able to properly convert GIF and static PNG, and probably GPX as well. We haven't tested GPX. So let's also test the GPX, for example. Yeah, this wallpaper, it's pretty big, so it will probably take a lot of time. Okay, let's copy and paste this. Okay, so I will remove this. Probably we don't even need it right now. What is the file name? It is this? I am not sure if it, if it can produce better than GPX, because GPX is already losing compression, as you know. Okay, let's try it. So all processes started at the same time, because we are not waiting them and they are running right now, as they get completed, it will close the window. And why it takes so long is that we are using the best possible algorithm, and let's see the output. Okay, so yes, the WebP file is bigger than the original GPX, it is because GPX is already losing. 
And when I save this GPX as a PNG, let's see the size, okay, size of thePng is this we can, of course optimize it a little bit more with PNG out win. And I am pretty sure there will be still significant difference between PNG version and WebP version. This is a software that I have purchased to optimize my PNG files previously. But it is not anymore necessary, because now we can use WebP format, which is much better format. Okay, so this software is single threaded on a single image, so it is taking some time. It has so many passes. Okay, so the optimized PNG file is 2.53 megabytes and minus 1.52 megabytes over or not this one. Actually, since GPX files are already losing, we shouldn't convert them to WebP, probably we we cannot. We cannot achieve same quality. I wonder if there is an losing but no point of converting GPX into WebP. Let me check that first. Okay, okay, same quality for GPX, I think we need to have some losing compression, probably for GPX compression, we need to use some other methodology. So let's see which which options we can use. Okay, let's see. Okay, so there is version loses near loses int so we can use, near loses. For GPX, I think okay, so which? Which option should we use? I'm not sure I think I will try near loses. Yeah, let's try it with so for that, I'm going to have another file it will be for GPX. For GPX, I'm going to remove, loses and change it with near loses withZero and I think I have to remove z9 as well so yeah I have to remove z9 okay let's try this way for GPX okay and this is the file name okay let's test GPX SR or a GPX and let's comment out this is and let's make it like this yeah okay let's see what kind of results we are going to get with GPX command. Okay so it is done. Oh wow now we have a better result than original GPX so let's compare two images quality of course I am not expecting them to be same. 
Yeah, I can see the difference there is already some difference, but I am not sure if we have lost some quality or not. Yeah, we have lost some quality, as you can see, definitely and it is not small as well. Okay, I wonder if it is possible to compress GPX losing quality. Is this even possible? I'm not sure compress GPX okay, okay,"]
FerdinandZhong commented 1 year ago

@FerdinandZhong I removed the code, and it works really fast on CPU as well. But you should support a CPU argument too.

So the result is printed on the screen, but how can I save the result into a text file?



You can simply assign the output to a variable:

output = inference.punctuation([example])[0]

and then save (append) it to a file, which is very simple in Python.
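A minimal sketch of that last step, in case it helps. It assumes `output` is whatever `inference.punctuation([example])[0]` returned; judging from the "testing result" log line in this thread, that looks like a list of strings rather than a single string, so the helper accepts either. The helper name `save_punctuated` and the file name `punctuated.txt` are made up for illustration and are not part of the library.

```python
from typing import Iterable, Union


def save_punctuated(output: Union[str, Iterable[str]], path: str = "punctuated.txt") -> None:
    """Append punctuated text to a UTF-8 text file (hypothetical helper)."""
    # Accept either a single string or a list of strings.
    texts = [output] if isinstance(output, str) else list(output)
    # Mode "a" appends to the file and creates it if it does not exist yet.
    with open(path, "a", encoding="utf-8") as f:
        for text in texts:
            # One punctuated text per line, without stray surrounding whitespace.
            f.write(text.strip() + "\n")


# Intended usage with the snippet above (not run here, needs the model):
# output = inference.punctuation([example])[0]
# save_punctuated(output, "lecture_subtitles.txt")
```

Opening the file with `encoding="utf-8"` avoids surprises on Windows consoles, where the default encoding may differ.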