Tatoeba / tatoeba2

Tatoeba is a platform whose purpose is to create a collaborative and open dataset of sentences and their translations.
https://tatoeba.org
GNU Affero General Public License v3.0

Sentences without audio files are showing up in searches limited to has_audio=yes #3049

Open ckjpn opened 1 year ago

ckjpn commented 1 year ago

The bug

Sentences without audio files are showing up in searches limited to has_audio=yes

The search I was using

https://tatoeba.org/en/sentences/search?from=eng&to=ita&sort_reverse=yes&trans_to=ita&trans_filter=exclude&trans_link=direct&has_audio=yes&sort=words
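
To make the filters in that URL easier to read, here is a minimal sketch that splits the query string into its parameters using Python's standard `urllib.parse`. The helper name `search_filters` is made up for illustration and is not part of Tatoeba.

```python
from urllib.parse import urlsplit, parse_qs

# The search URL from the report, split over two lines for readability.
SEARCH_URL = (
    "https://tatoeba.org/en/sentences/search?from=eng&to=ita&sort_reverse=yes"
    "&trans_to=ita&trans_filter=exclude&trans_link=direct&has_audio=yes&sort=words"
)


def search_filters(url: str) -> dict[str, str]:
    """Return the query parameters of a search URL as a flat dict.

    Hypothetical helper for illustration only.
    """
    query = urlsplit(url).query
    # keep_blank_values=True so that empty filters such as "native=" survive.
    parsed = parse_qs(query, keep_blank_values=True)
    # parse_qs returns a list per key; keep the last value for each key.
    return {key: values[-1] for key, values in parsed.items()}


filters = search_filters(SEARCH_URL)
for name, value in filters.items():
    print(f"{name} = {value}")
```

The parameter that matters for this bug is `has_audio=yes`: every sentence in the results should have at least one audio file attached.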

Background

I'm fairly certain this sentence had an audio file at one time, but it was later removed.

Screenshot

[Image: Screen Shot 2023-03-29 at 16 26 15]

DJ-Saidez commented 1 year ago

I can confirm that. I had added audio to that one, and my speech was of less-than-ideal quality. I guess the sentence didn't automatically switch to has_audio=no when the audio was removed.
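
One way this kind of mismatch can happen, sketched as a toy model: if a search index caches a per-sentence audio flag and nothing refreshes it when a recording is deleted, the flag goes stale. This is only a guess at the mechanism, not Tatoeba's actual code; `Sentence`, `ToySearchIndex`, and the file name are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Sentence:
    id: int
    text: str
    audio_files: list[str] = field(default_factory=list)


class ToySearchIndex:
    """Hypothetical index that caches a has_audio flag per sentence."""

    def __init__(self) -> None:
        self.has_audio: dict[int, bool] = {}

    def index(self, sentence: Sentence) -> None:
        # Snapshot the flag at indexing time; it is not updated automatically.
        self.has_audio[sentence.id] = bool(sentence.audio_files)

    def search(self, has_audio: bool) -> list[int]:
        return sorted(sid for sid, flag in self.has_audio.items() if flag == has_audio)


sentence = Sentence(1, "Example sentence.", ["recording.mp3"])
index = ToySearchIndex()
index.index(sentence)

# The audio is removed, but the sentence is not reindexed.
sentence.audio_files.clear()
stale = index.search(has_audio=True)  # still lists sentence 1

# Reindexing after the removal brings the flag back in sync.
index.index(sentence)
fresh = index.search(has_audio=True)  # no longer lists sentence 1
```

In this model the fix is to reindex a sentence whenever one of its audio files is deleted, not only when one is added.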

ckjpn commented 1 year ago

Here is another search that shows this problem.

shekitten's longest sentences with audio.

https://tatoeba.org/en/sentences/search?from=eng&has_audio=yes&native=&orphans=no&query=&sort=words&sort_reverse=yes&tags=&to=&trans_filter=limit&trans_has_audio=&trans_link=&trans_orphan=&trans_to=&trans_unapproved=&trans_user=&unapproved=no&user=shekitten

The cause is the same as above, as DJ-Saidez pointed out.

profesyonal commented 4 months ago

This problem seems to be fixed? I don't see sentences without audio on the links provided by CK.

ckjpn commented 3 months ago

Here is a search that will show sentences that shouldn't be in the results.

DJ_Saidez's longest sentences "with audio" search.

https://tatoeba.org/en/sentences/search?from=eng&has_audio=yes&native=&orphans=no&query=&sort=words&sort_reverse=yes&tags=&to=none&trans_filter=limit&trans_has_audio=&trans_link=&trans_orphan=&trans_to=&trans_unapproved=&trans_user=&unapproved=no&user=DJ_Saidez&word_count_max=&word_count_min=1

Note that this happens because DJ_Saidez contributed audio recordings that were uploaded and later removed when the admins determined they were not appropriate for tatoeba.org.

All, or most, of the audio connected to his sentences consists of recordings by me.