Closed by Jibec 5 years ago
After digging a little bit, there is something weird with translations/source/*/dictionaries/: it contains a group of files that look like:
[
{
'filemask':'*.po',
'file_format':'po'
} [
meta:{
'discovery':'GettextDiscovery',
'origin':None,
'priority':1000
}
],
{
'filemask':'*/dialog.po',
'file_format':'po'
} [
meta:{
'discovery':'GettextDiscovery',
'origin':None,
'priority':1000
}
],
{
'filemask':'*/dialog/registry/data/org/openoffice/Office.po',
'file_format':'po'
} [
meta:{
'discovery':'GettextDiscovery',
'origin':None,
'priority':1000
}
],
{
'filemask':'*.po',
'template':'en.po',
'file_format':'po-mono'
} [
meta:{
'discovery':'GettextDiscovery',
'origin':None,
'priority':1000
}
],
{
'filemask':'*/dialog.po',
'template':'en/dialog.po',
'file_format':'po-mono'
} [
meta:{
'discovery':'GettextDiscovery',
'origin':None,
'priority':1000
}
],
{
'filemask':'*/dialog/registry/data/org/openoffice/Office.po',
'template':'en/dialog/registry/data/org/openoffice/Office.po',
'file_format':'po-mono'
} [
meta:{
'discovery':'GettextDiscovery',
'origin':None,
'priority':1000
}
]
]
For your information, running discover with 20 languages takes 33 seconds (everything is in a tmpfs/RAM).
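To make that kind of measurement reproducible on a subset of languages, here is a minimal sketch using only the standard library. It is not how translation-finder works internally: find_po_files and timed are hypothetical helpers, and the plain directory walk is only a stand-in for the real discovery call.

```python
import os
import time


def find_po_files(root, max_languages=None):
    """Return relative paths of *.po files under root.

    Only the first max_languages top-level directories (sorted by name)
    are scanned, which mimics testing on a subset of languages.
    Hypothetical helper, not part of translation-finder.
    """
    languages = sorted(e.name for e in os.scandir(root) if e.is_dir())
    if max_languages is not None:
        languages = languages[:max_languages]
    found = []
    for lang in languages:
        for dirpath, _dirs, files in os.walk(os.path.join(root, lang)):
            for name in files:
                if name.endswith(".po"):
                    full = os.path.join(dirpath, name)
                    found.append(os.path.relpath(full, root))
    return found


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start
```

Calling timed(find_po_files, "translations/source", max_languages=20) would give a baseline for the bare directory walk, so the time spent in discovery itself can be separated from file system overhead.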
The problem I see there is https://github.com/LibreOffice/translations/tree/master/source/ab/dictionaries - it's a locale-specific directory containing files named after locales. I don't see a good way to automatically decide which one is the actual language code and which is not.
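The ambiguity can be illustrated with a small sketch. This is not translation-finder's actual algorithm: LANG_RE and candidate_masks are hypothetical, the language-code pattern is deliberately naive, and the file name af_ZA.po is an assumed example of a locale-named file in that directory.

```python
import re
from pathlib import PurePosixPath

# Naive pattern for something that looks like a language code (assumption).
LANG_RE = re.compile(r"^[a-z]{2,3}(_[A-Z]{2})?$")


def candidate_masks(path):
    """Return one file mask per path component that looks like a language code."""
    parts = PurePosixPath(path).parts
    masks = []
    for i, part in enumerate(parts):
        # Split off the extension so "af_ZA.po" is tested as "af_ZA".
        stem, dot, ext = part.partition(".")
        if LANG_RE.match(stem):
            wildcard = "*" + dot + ext
            masks.append("/".join(parts[:i] + (wildcard,) + parts[i + 1:]))
    return masks


# Both "ab" and "af_ZA" look like language codes, so two masks are plausible.
print(candidate_masks("source/ab/dictionaries/af_ZA.po"))
```

With two components matching, nothing in the path alone says whether the directory or the file name carries the translation language, which is why both kinds of masks show up in the results.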
As for performance, this library was not considered to be performance critical, so there might be ways to make it faster. In Weblate it's executed just once at repository import and the slow thing in this case is cloning the repository.
I've made some performance improvements now; going further would take too much effort for the time being.
That's fine with me, thank you! As I run it on many packages, it's always good to pay a little attention to performance :)
Please don't miss the few false positives.
I've commented on the false positive above: https://github.com/WeblateOrg/translation-finder/issues/14#issuecomment-488223034
thank you Michal :)
Hello,
while working on this: https://pagure.io/fedora-localization-statistics/
I found that translation-finder never returns results on a package like LibreOffice.
The translations are inside a dedicated archive: http://download.documentfoundation.org/libreoffice/src/6.2.2/
Here is what the file hierarchy looks like (I kept only the first three folders for the languages "ab" and "af"):
The expected result should be:
But this is the actual result (when I ran this, I kept only about the first 10 languages):
All the translations together take up 1.4 GB...