snakers4 / silero-models

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

Issue getting from silero model tried for text enhancement #201

Closed Kishan-Sahu closed 1 year ago

Kishan-Sahu commented 1 year ago

Issue

  File ".release_module.py", line 122, in enhance_text
  File ".release_module.py", line 101, in enhance_long_textblock
  File ".release_module.py", line 72, in enhance_textblock
  File ".release_module.py", line 165, in enhance_tokens
IndexError: string index out of range

Details

I was adding punctuation to text using the Silero models via PyTorch Hub, and everything went smoothly until I hit the transcripts attached below. I have no idea why this is happening. I use this model to add punctuation to transcripts that I collect from YouTube; some of them are missing only a few punctuation marks (supplied by the video author), while others have no punctuation at all (auto-generated by YouTube).

Transcripts throwing the error

transcript 1: ""Hey there. How's it going everybody in this video? We'll be learning about python Data types and specifically We'll be learning about how to work with textual data and textual data in python are represented with strings So we currently have [opened] our intro pi file that we were working with in the last video Where we just printed out hello world and I'll go ahead and run this so that we can see that down here It does print out hello [world] [now] This line here is using the print function and we're passing this text value into that print function now if we wanted to create a Variable that holds that text value then we could say now I'll just get rid of this comment for now So if I wanted a variable to hold that value then I can just create a variable and we'll call that"

transcript 2: "you're now ready to see how to go one layer of a convolution on your network let's go through the example you've seen in the previous video how to take a 3d volume and convolve it with say two different filters in order to get in this example two different 4x4 outputs so let's say convolving with the first filter gives this first 4x4 output and convolving with this second filter gives a different 4x4 output the final thing to turn this into a convolutional neural net layer is that for each of these we're going to add it bias so this is going to be a real number and what - broadcasting you kind of had the same number - every you know one of these sixteen elements and then apply a non-linearity which for illustration that says there a luna mini arity and this gives you a 4x4 output after applying the bias and the non-linearity and then for this thing at the bottom as well you had some different buyers again this is a real number so you had the same row number - all 16 numbers and then applies some non-linearity that fairly non-linearity and this gives you a different 4x4 output then same as we did before if you take this and stack it up as follows so they end up with a 4 by 4 by 2 output then this computation where you've gone from 6 by 6 by 3 to a 4 by 4 by 4 this is one layer of a convolutional neural network Center mapped is back to one layer of for propagation in the standard neural network when a non convolutional neural network remember that one step afford prot was something like this right z1 equals w1 times a0 a0 was also equal to X right and then plus b1 and he applied the non-linearity to get a 1 so that's G of Z 1"

Please review the above transcripts and let us know what the problem is.

snakers4 commented 1 year ago

Can you try the same text, lowercased and without any punctuation?
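
The suggestion above (lowercase, no punctuation) can be sketched with the standard library alone. This is a minimal sketch, not part of silero-models: the function name `normalize_for_te` is hypothetical, and the choice to keep only ASCII letters, digits, apostrophes and spaces is an assumption that suits English transcripts like the ones in this issue.

```python
import re


def normalize_for_te(text: str) -> str:
    """Lowercase text and keep only a-z, 0-9, apostrophes and single spaces."""
    text = text.lower()
    # Replace every character outside the allowed set with a space,
    # so that words separated only by punctuation do not get glued together.
    text = re.sub(r"[^a-z0-9'\s]", " ", text)
    # Collapse the runs of whitespace left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()


print(normalize_for_te("Hey there. How's it going [everybody]?"))
```

Note that this also strips square brackets and colons, which a narrower "remove commas and periods" pass would leave in place.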

Kishan-Sahu commented 1 year ago

Thank you for your response. I tested the following example and a few more, and it works fine. However, is there any way to avoid eliminating existing punctuation and capital letters?

snakers4 commented 1 year ago

Currently not. You have to store the existing capitalization and punctuation manually and decide which you prefer, the original or the new, and/or combine them.
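
One way to do the manual combination described above is a token-by-token merge. The sketch below is only an illustration under stated assumptions: `merge_prefer_original` and `_key` are hypothetical helpers, not silero-models functions, and the policy (keep the author's token whenever it already carries capitalization or punctuation, otherwise take the model's token) is just one of several reasonable choices. It also assumes the model output contains the same words in the same order as the original.

```python
import re


def _key(token: str) -> str:
    """Comparison key: the token lowercased, without punctuation."""
    return re.sub(r"[^a-z0-9']", "", token.lower())


def merge_prefer_original(original: str, enhanced: str) -> str:
    """Combine author-supplied caps/punctuation with model-supplied ones."""
    orig_tokens = original.split()
    enh_tokens = enhanced.split()
    # The merge is positional, so both texts must contain the same words.
    if [_key(t) for t in orig_tokens] != [_key(t) for t in enh_tokens]:
        raise ValueError("texts do not contain the same word sequence")
    merged = []
    for orig, enh in zip(orig_tokens, enh_tokens):
        # A token that differs from its bare key already has caps or
        # punctuation from the author, so it wins over the model's token.
        merged.append(orig if orig != _key(orig) else enh)
    return " ".join(merged)


print(merge_prefer_original(
    "we use Python here and PyTorch there",
    "We use python here, and pytorch there.",
))
```

A limitation of this policy: when the original token has capitalization and the model adds punctuation to the same token, the model's punctuation is dropped.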

Kishan-Sahu commented 1 year ago

Okay, thanks. Where should I make changes if I want to keep the original?

Kishan-Sahu commented 1 year ago

I also checked by removing all punctuation and converting the text to lowercase, but it still does not work.

Transcript: "hey there how's it going everybody in this video we'll be learning about python data types and specifically we'll be learning about how to work with textual data and textual data in python are represented with strings so we currently have [opened] our intro pi file that we were working with in the last video where we just printed out hello world and i'll go ahead and run this so that we can see that down here it does print out hello [world] [now] this line here is using the print function and we're passing this text value into that print function now if we wanted to create a variable that holds that text value then we could say now i'll just get rid of this comment for now so if i wanted a variable to hold that value then i can just create a variable and we'll call that message and we'll set that message variable equal to our text value that we passed in to print and here message is our variable [name] so if you're coming from another [language] then you might be wondering if we need [to] use semicolons or something like that to end each line but python doesn't need that it operates on whitespace alone which makes it a very clean language to work with now by convention our variables are usually all lowercase and if it's multiple words then we separate those with underscore so if instead my variable name was my message then it would be my underscore message so that's just a convention that's commonly used and also these variable names should be as descriptive as possible as to what values they're meant to hold so message is a good variable name here because it's holding our message but if i was instead to call this variable m which is a valid variable name but anyone reading my code now wouldn't know when they see this in variable what value it's supposed to hold so message is a much better variable name in this case so this message variable now holds our
text data and our text data is called a string now we can use our variable in place of this string anywhere that we use it in our program so instead [of] printing out hello world directly here i'm instead going to now print out this variable and that should give us the same results so if i run that then you can see down here we still get [the] same result now we can see that i created this string with single quotes now you can also use double quotes so if you're wondering if there is a difference between the single quotes in the double quotes and it really just depends on what text you have in your string so for example if you have a single quote in your string so for example let's create one with a single quote instead of hello world [i] will instead make our string bobby's world now see the problem [here] is that python sees this single quote within our text as being the end of the string and it's not going to know what to do with what comes after that single quote here because it thinks that's where the string is closing so if you run this now then you'll get an error now there's a couple of different ways to [handle] this we could escape this single quote with a backslash so if i escape that single quote and now run this now python knows that this single quote doesn't close out the string and that instead this one should now another way to handle that is [to] instead so i'll take away that xscape character now another way to handle that is instead just use double quotes on the outside of our string any string [that] contains single quotes so if we know that our string contains a single quote then we can instead use double quotes on the outside and then it'll know that that single quote isn't the end of the string so now if we run this then we can see that it still works fine but that doesn't necessarily mean that double quotes are better because it goes both ways if your string contains double quotes in the message then i would use single quotes on the outside 
now if we needed to create a multi line string and one way we can do this is by using three quotes at the beginning and end of our string and these can also be single or double quotes as [well] so let's go ahead and look at an example so i'll add three quotes to the beginning and end of our string here and now i'll just add [some] text to spans multiple lines here so i'll just say was a good and then hit enter to go to a new line and say cartoon in the 1990s so now if we run [that] and we can see that it printed out [our] string correctly over multiple lines [okay] so let's go back to our simple hello world example so i'm just going to take all of that and replace it with our previous hello world example and now let's go ahead and just run that really quick so we're back to our starting point so we can think of our string as a string of individual characters and we can access these individual characters also so first let's see how we can find how many characters are in our string so to do this we can use the [len] function which stands for length so whenever i print out here if i was instead of printing my message if i was to print out this length function and pass in message and then run this now we're no longer printing out our message we're printing out the length of our message and we can see that it says that the length of our string is 11 and if we counted these up 1 2 3 4 5 6 7 8 9 10 11 then we can see that that's correct and we can access our strings characters individually so to do this we can use the square brackets after our string and pass in the location of the character that we want so i'll say print message and then square brackets now the location is called an index and it starts out at 0 so to access the first character of our string we can say print the message and then access this index of 0 so if we print that then we can see that we got the capital h now since the length of our string is 11 that means that with the first character starting out 
at 0 our last character would be at index 10 so it's our total length minus 1 so if i was to say print out the location at index 10 the value at index 10 then we can see that we got our d character which is the last character in that string now if you accidentally try to access an index that doesn't exist then you'll get an index error so if we were to say access the index of 11 and we can see that that through an index error now we can also access a range of characters so if i just wanted to get the word hello from our string then we could say that we want 0 and then this colon 5 and we'll explain this so the first index here is our starting point and the second in that index which is separated by this colon is the stopping point now one thing a little confusing about this is that the first index is inclusive which means it's going to include that value but the second index is not now there's good reasons for this but it's still easy to forget so basically what we're saying is i want all the characters between the beginning and up to but not including the fifth index and it will be more clear if we just go ahead and run this so we can see that it prints out hello so it printed out our message from the zero index here all the way up to but not including the [5th] index here so we got hello now since our starting point here is just [the] first index we can actually just leave that off and it will assume that we want to start at the beginning so if we don't put anything there and then : [five] then we should get the same thing so if we run [that] we can see we still got hello now instead if we wanted to grab the word world from the string then we could start at the sixth index and then we can just go to the end and just like leaving off our starting index it will start from the beginning leaving off the stop index just goes all the way to the end so now if we run that and we can see that it gives us back the word world now what we're doing here is called slicing and 
if you'd like to learn more about slicing in depth then you can watch my detailed video on that and i'll leave a link to [that] in the [description] section below so now let's just go back to printing out our message and let me run this okay so all of the data types that we're going to review are going to have certain methods available to us that give us access to a lot of useful functionality now when i say methods a lot of people wonder what's the difference between a method and a function and functions and methods are basically the same thing [a] method is just a function that belongs to an object it's not important to get into the details of that now but if you hear me say method or function then you can basically think of those as the same thing for now so like i was saying the data types that we'll be going over all have certain methods available to us that give us access to a lot of useful functionality so let's look at some of these string methods so we can see here that our hello world text is capitalized but let's say [that] we wanted that to be all lowercase now to do this it's just as easy as saying print message and then to lowercase this we can say dot lower now when we run this dot lower with these parentheses here that's running the lower method on the string so when we ran that we can see that now our hello world has been set to all lowercase [and] [if] we wanted to do this to uppercase and it's as easy as changing at lower to upper if we run that now we can see that hello world is all uppercase so now let's say that we want to count a certain number of characters in our string and to do that we can use the count method and the count method actually takes a string as an argument so we can say message dot count and now we have to pass in an argument and it has to know what we want to count in our message so what for now we'll [just] say count hello and if we run that we can see that it returns that the string hello appears in our message one time 
but if we instead just passed in a single character as argument so i'll pass in an l if i run that you can see that we get a 3 because there are three l's and our message variable so if we instead want to find the index of where certain characters can be found and we can use the find method so i could come up here instead of saying dot count i can say not fine and now this takes an argument as well it's what we want to find so let me type in world here and run this and we can see when [we] run this that it returns a six and that's because world starts at the sixth index of our message variable now if we try to find a string of characters that doesn't exist then it will just return negative one so if instead of world i typed in universe and ran that you can see it returns a negative one because it can't find that anywhere in our message variable okay so now let's say that we want to replace some characters in our string with some other characters and we can do this with the replace method now first i'm going to change this change this back to printing out our regular message [so] i'll just delete these now let's try to use our replace method right below where we first set our message fair herbal and this method takes two arguments first it takes what we want to replace so first let me just say message but replace so first it takes what we want to replace so let's say that we want to replace world and now the second argument which is separated by a comma [is] what we want to replace world with so we'll replace world with universe so now if [we] run this and this might be a little unexpected we can see that it's still printing out hello world [now] the reason our replacement didn't work is because it's not making that replacement in place it's actually returning a new string with those values replaced and we'll learn more about return statements when we learn about functions but basically we need to set a new variable here to get that returned string with those 
replacements so i could say something like new message is equal to that original method with this replacement so now if we set that new variable and instead print that new message and run that and you can see that now it replaced the world with universe and if you really wanted to make the replacement to the message variable then instead of making a new message variable and we can just set this same message variable again so now we're setting the same message variable equal to that replacement and then printing it out so if i run that and we can see that now the message variable had world replaced with universe and this may look a little strange [to] set a variable to an altered [version] of itself but it's very common and you'll be using that a lot okay so now let's get rid of this replace line here and now let's look at how we can add multiple strings and concatenate them together so instead of saying hello world i'm instead just going to set this equal to hello and instead of calling this message i'm going to set this equal to greeting and just below greeting i'm going to create a variable called name and i'm going [to] set that equal to we'll just say michael and now lastly let's create a message here and we want this message to combine our greeting with our name wanted to say hello michael and one way to do this is to use the plus sign operator so we could try this by saying greeting plus name now if we run this and we can see that it's not exactly what we wanted it combined them together but it doesn't have a space there so when you're concatenated strings together it's easy to make mistakes like this so what we want to do is add a string between them so that spaces them out so we can add a string between these just by putting in a string literal here and i'm also going to put in a comma and a space to separate those and now we also need another plus sign so that we're adding the name after that string so now we're saying that we want this greeting which is 
hello plus this comma and space plus the name so if i run this then we can see that it concatenated those strings together how we wanted now sometimes using the plus sign isn't the best way to go if we wanted to create a longer more complicated string while using our variables and adding them all together like this might get hard to keep track [of] so let's say that we wanted to add to the end of our message just by you know closing off a sentence and saying welcome so to do that after our name variable we could add another string that is a period to close off that sentence a space and then welcome with an exclamation point so let's go ahead and run that now we can see that it printed out the string how we wanted it to look but it's starting to get a little complicated on this line here to keep track of all of our plus signs and spaces within our message instead with strings like this it's usually better to use a formatted string [this] allows [us] to write the sentence as it will appear and put placeholders in place of our variables so let's go ahead and see what this would look like so though instead use a formatted string we can say message is equal to and we're just going to write it exactly how it appears except everywhere where we want to replace with a variable we're going to put in a placeholder and those placeholders are going to be these curly brackets so we want a placeholder for the greeting and then a comma a space and a placeholder for the name [and] a period and then [we'll] type out welcome and an exclamation point and now to fill those placeholders with our variables we can say dot format and pass in greeting for the first placeholder and [then] name for the second place holder so now let's go ahead and delete this line where we where we were using the plus sign operator so if we run this we can see that we got the same [thing] as when we concatenate and then together but with longer more complicated strings this is actually a little bit easier to 
read it might look a little confusing now since we're just seeing these placeholders for the first time but after you get used to this you will realize that this is easier than keeping track [of] all those concatenations and using [stream] formatting like this also gives us some extra functionality and if you want to see everything [that] you can do with that and i do have a detailed in depth video on string formatting and i'll add that to the description section below ok and lastly if you're using python 3 6 or above then you'll have access to these new things called f strings now if you aren't using [3 6] or higher then these aren't going to work because they were only recently released and not a lot of people are using these f strings yet but i'm really liking them so far basically the idea behind f strings was to make string formatting as simple as possible so let's go ahead and see what these look like so to say that [we] want this to [be] an f string we can just add an f right here to the beginning on the outside of the string now instead of using this dot format method well instead just write the variables directly within the placeholders so [i] can remove that dot format method and now we can pass the variables directly within the placeholder so we'll pass greeting to that placeholder and name to that one so i can save that and run it and you can see that using that we got the same result [and] like i said these f strings are pretty new to the language so even people who are very familiar with python may be seeing this for the first time now one reason i really like these f strings is because you can actually write code within the placeholder so if i wanted the name and all caps for some reason then i could just come [into] the placeholder [here] and say name dot upper and run that directly within the placeholder if i run that then you can see that now our name was capitalized okay so we're almost finished up but i wanted to show you one last trick here 
when you're learning something new in python so [we] saw a lot [of] different methods that we could use on our strings now if we ever wanted to see everything that's available to us then there are a couple of things that we can do so first we can use this dir function and that looks like this and what this does is if we pass in a variable then it will show us all of the attributes and methods that we have access [to] with [that] variable now don't worry about all of these double underscore methods yet but if we look down here then we'll see a couple of familiar methods that we used in this video [so] for example we have dot upper here we have not replace dot lower so this kind of gives you a broad overview of all the attributes and methods that are [available] to you now this doesn't show what any of these actually do so to see more information about these string methods then we can use the help function but if we run this help function on the name then that won't work we actually need to run it on the string class itself so we'll just type in help and then str for string and if i run that and we can see that this gives us a lot more [information] and i'll pull this up here a bit so it you know tells us everything that's available to us and if i scroll down here [a] bit and if i try to find something that we actually used in this video okay so we have fine here so you can see that actually gives us a description of what these methods do and what arguments they take now if you know that you had a certain method available to you but you couldn't remember exactly what it does then you can actually pass it directly in to help also so if i wanted to find out some more information about the lower method then we could say string dot lower and if we run that and we can see that it gives us information and the description just of what lower does so using that dir function and that help function is a great way to get a broad overview of the methods and attributes that [are] 
available to you and also what using the help function gives you an idea of what those methods do without actually you know going online and checking everything there okay so i think that is going to do it for this video i hope that now you feel comfortable with working with strings and are familiar [with] some of the useful things that we can do with those in the next video we're learning how to work with numbers in python and specifically integers and floats but if anyone has any [questions] about what we covered in this video then feel free to ask in the comment section below and i'll do my [best] to answer those if you enjoy these tutorials and would like to support them then there are several ways [you] can do that the easiest ways to simply like the video and give it a thumbs up and also it's a huge help to share these videos with anyone you think would find them useful and if you have the means you can contribute through patreon and there's a link to that page in the description section below be sure to subscribe for future videos and thank you all for watching you"

Kishan-Sahu commented 1 year ago

The model runs successfully on the above text after removing all non-alphanumeric characters.
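
The lowercased transcript posted earlier in this thread still contains square brackets and a stray colon. The following is a minimal sketch for listing such leftover characters before passing text to the model. The function name `leftover_characters` and the allowed character set are illustrative assumptions, not part of silero-models, and the thread does not confirm which character actually triggered the IndexError.

```python
from collections import Counter

# Characters assumed to be safe input: ASCII letters, digits, apostrophes.
ALLOWED = set("abcdefghijklmnopqrstuvwxyz0123456789'")


def leftover_characters(text: str) -> Counter:
    """Count characters that are neither allowed nor whitespace."""
    return Counter(
        ch for ch in text.lower()
        if ch not in ALLOWED and not ch.isspace()
    )


sample = "so we currently have [opened] our intro pi file and then : [five]"
print(leftover_characters(sample))
```

An empty counter means the text contains only letters, digits, apostrophes and whitespace; anything else shows which characters a cleanup pass has missed.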