Closed samedii closed 3 years ago
Oh, hi! I was very stressed when I authored that, so I might have forgotten to submit the pull request. I have no objections on this code being merged. I'll just briefly explain what's happening there and the motivation behind it.
I was trying to compare the PESQ of different codecs buffer-to-buffer, which might not actually be the ideal way to do it but I thought it could be a good idea anyway and just kept getting random segfaults and crashes everywhere. I've checked the code and there was an sneaky exit(1)
with the undescriptive "Processing error!\n"
error message which was very hard to debug.
This code generates a python exception instead, so the user can decide what to do with the information. I gotta add that I've only superficially tested if this affects scores past the error, but it didn't seem like it did from the values I got.
Ps.: I didn't know what exactly was the problem with the pow_of function, so I've maintained the crash behaviour there, but I've added a full "backtrace" so whoever happens to come by with that one will be able to see exactly where it is blowing up from.
Thanks for the contributions!
yes, raising an exception is good. it can give a freedom to users.
It will be better if you give a crash example for everyone interested in this code to check the problem.
Thanks again.
The issue for me is that sometimes the clips do not have much sound. This recreates the issue (with no sound at all in reference):
ref = np.zeros(16000)
deg = np.random.randn(16000)
pesq_(16000, ref, deg, 'wb')
It will be better if you give a crash example for everyone interested in this code to check the problem.
Would you like me to add an example in the readme?
from pesq import PESQError
I think there are some possible improvements like having a separate error for no utterances and frequency error? No utterances should be a common error in a lot of applications and it may even be better to just return np.nan
?
The issue for me is that sometimes the clips do not have much sound.
Pretty much the same problem I've ran into.
it may even be better to just return
np.nan
?
idk, nan's can poison your maths really quickly and be just as hard to debug if you're not keeping track of where they are being generated. I'd rather fail loudly than silently mess someone's numbers.
Great! Thanks a lot for giving the examples!
I didn't encounter this error before, since I always use a clean audio instead of the zero array as the reference signal.
If the reference signal is 0, we can just return the NaN. So the easy solution could be checking the input reference signal before using the C PESQ code.
I'm getting this error for reference signals with just a little bit of sound too. I just chose the completely silent signal because it was easy to show. I can upload a signal that fails if you would like.
So in order to return np.nan
we would have to catch the C PESQ error in python or change the C PESQ code to return NaN instead of 'no utterances' error
Well, if the objective is to avoid the performance hit due to exceptional cases, or better semantics for something that is not really an exceptional case, I'd say the C code should return NaN. It would be a somewhat pointless change otherwise.
Agreed. Do you want to merge this as is first or do you want to try and add this feature too?
I am not familiar with cython but I can try and have a look at it here or in another PR.
I think it's up to @ludlows. The quick and dirty way to change it would be to replace the raise for return np.nan
in the .pyx file.
Hi Everyone,
Thanks for the discussions above.
Since many people are using this pesq
code for evaluating results on big datasets, the performance should be the 1st issue except the precision.
So right now, we may have the following steps to do:
np.nan
as the answer for the python code. How do you think about it? of course, any friend has used this code can provide his opinion.
I don't think all this needs to happen in this PR. The proposed changes do not affect performance or behavior in how it is currently used.
- print the error in the C code if it happens.
Maybe I misunderstood but I don't think raising an exception and also printing at the same time is good practice.
it is better to have a behavior without raising exceptions. However, we should let users know which error happends. So I prefer to print the error name and return `np.nan' only.
I prefer returning np.nan too but printing will make me unable to use this library. I am evaluating millions of sound files. My log would be exploding if we did that
ok. You have convinced me. I prefer to only return np.nan
if any possible error in C code happends.
How do you think about it? of course, any friend has used this code can provide his opinion.
Since you said, everyone can share his opinion here is my opinion:
I would like to have an exception that contains some information about the error,
i.e. Invalid sampling frequency
or No utterances detected
.
As exception class the buildins could be used:
ValueError('Invalid sampling frequency' + str(sampling_frequency))
or RuntimeError('Invalid sampling frequency' + str(sampling_frequency))
FileNotFoundError('No utterances detected' + str(path))
When someone wants to ignore the Exception, he can suppress it in a try except block.When you return instead a np.nan
, you cannot distinguish between the different Error Types.
Also, the code then silently passes (As already mentioned by @Rafagd).
I am also against printing, because it is annoying to suppress the print.
As far as I know is the overhead of a raised Exception not relevant. Is there a reason against an Exception?
I think there are some cases that deserve an exception like Invalid sampling frequency
.
However, the specific case No utterances detected
could maybe return np.nan
or something rather than raise an exception since it is quite common (at least in my use case) and there is not really anything wrong with the input.
I am fine with an exception for everything too. I am not seeing any big performance hits in my code anyway.
TL;DR; exceptions for everything except No utterances detected
Since many people are using this pesq code for evaluating results on big datasets, the performance should be the 1st issue except the precision.
Well, whoever is using the code now is probably is doing a "look before jumping" approach to the problem, as it just flat-out crashes if you don't now. This patch, as-is, would probably not affect them, but it would help any new users to learn they should "look before jumping" (or ignore if they don't care about performance/silence is not common).
I prefer returning np.nan too but printing will make me unable to use this library. I am evaluating millions of sound files. My log would be exploding if we did that
We could probably have a "best of two worlds approach". We can change the return codes to negative numbers, save an error string to a global variable (ugh), and have a pesq_error() that prints the last error we got. And maybe an extra parameter on pesq(A, B, C, suppress_errors=True) to skip the error string generation. We could even add a couple of ifdefs to the C code so people who are super-performance-paranoid can #define out all the error detection at compilation time.
As far as I know is the overhead of a raised Exception not relevant.
From what I've read, exceptions are faster than IF-ELSEs if they follow the happy path. They are way slower if they don't. So it's a trade-off on how often do you think your stuff will trigger the exceptions. There's also a bit of semantics going on here. If we expect some silent data, it's not really an exception, right?
Ps.: We could also decide some arbitrary pesq value for no utterances too, but I'm not sure that makes sense.
ok. Thanks for all the discussions above.
I think I need to create a new dev-branch
for prototyping your ideas.
This can probably be closed now, it should be contained in #22.
I could not use this package without this functionality but I found a fork that had implemented it already. I just checked for conflicts.
This PR allows the user to handle failure with try except.
The author @Rafagd may have some input? I also emailed him.