WeblateOrg / weblate

Web based localization tool with tight version control integration.
https://weblate.org/
GNU General Public License v3.0

Weblate scanning has no signs of progress #7250

Closed: pontaoski closed this issue 2 years ago

pontaoski commented 2 years ago

Describe the issue

When importing large repositories into Weblate, the scanning phase takes a very long time, with no indication of what it is doing or how long it will take.

Steps to reproduce the behavior

  1. Go to create component
  2. Import from VCS
  3. Input the URL of a large repository (the one I'm importing is about 150k files)
  4. Press continue
  5. It hangs; no signs of life to the user

Expected behavior

  1. Go to create component
  2. Import from VCS
  3. Import URL of a large repository
  4. Press continue
  5. It tells me what it's doing and how long it thinks it will take

Screenshots

No response

Exception traceback

No response

How do you run Weblate?

PyPI module

Weblate versions

Weblate deploy checks

No response

Additional context

No response

tomkolp commented 2 years ago

I always watch the console and Docker logs during import; they show the progress of file processing. Unfortunately, I do not always have remote access to that console.

nijel commented 2 years ago

Component creation has a log that is visible in the application. The repository scanning merely consists of git clone...

nijel commented 2 years ago

Related to https://github.com/WeblateOrg/weblate/issues/7251

nijel commented 2 years ago

To figure out what is really the expensive operation, you can try it without Weblate:

  1. Get the default branch (unless you specify one): git ls-remote --symref repo:url HEAD
  2. Clone the repository: git clone --depth 1 --branch repo:branch repo:url repo:destination
  3. Find translation files using translation-finder: translation-finder repo:destination

With ~150k files, though, my guess is also that translation-finder is the bottleneck here, and https://github.com/WeblateOrg/weblate/issues/7251 could address this.
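
To see which of the three steps above dominates, each one can be timed separately. The following is only a minimal sketch, not anything Weblate itself runs: the helper names (build_commands, time_command, profile) are made up for illustration, the branch is assumed to be known up front, and the finder executable name is taken as written in the step list above, so it may need adjusting to match how translation-finder is installed on your system.

```python
import subprocess
import sys
import time


def build_commands(url, branch, dest, finder="translation-finder"):
    """Return (label, argv) pairs mirroring the three manual steps.

    `finder` is an assumption: pass whatever executable name your
    translation-finder installation actually provides.
    """
    return [
        ("ls-remote", ["git", "ls-remote", "--symref", url, "HEAD"]),
        ("clone", ["git", "clone", "--depth", "1", "--branch", branch, url, dest]),
        ("discover", [finder, dest]),
    ]


def time_command(argv):
    """Run one command and return (exit_code, elapsed_seconds).

    exit_code is None when the executable could not be found.
    """
    start = time.perf_counter()
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        code = result.returncode
    except FileNotFoundError:
        code = None
    return code, time.perf_counter() - start


def profile(url, branch, dest, finder="translation-finder"):
    """Time each step in order, stopping at the first failure."""
    for label, argv in build_commands(url, branch, dest, finder):
        code, elapsed = time_command(argv)
        print(f"{label}: exit={code} elapsed={elapsed:.1f}s")
        if code != 0:
            break


if __name__ == "__main__" and len(sys.argv) >= 4:
    # Usage: python profile_import.py <repo-url> <branch> <destination-dir>
    profile(sys.argv[1], sys.argv[2], sys.argv[3])
```

If the last step accounts for most of the elapsed time, that points at file discovery rather than the clone.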

nijel commented 2 years ago

I've looked at translation-finder and there is a lot of room to improve its performance. https://github.com/WeblateOrg/translation-finder/commit/510ef7a2664d400b3f650a089cfbb3d6a051fdc2 should remove ~300k syscalls in your case.

github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale because there wasn’t any recent activity.

It will be closed soon if no further action occurs.

Thank you for your contributions!