Closed GoogleCodeExporter closed 9 years ago
Thank you for this patch. Concurrency has been bothering me (and our clients)
quite alot and I've been thinking about the best way to resolve the problem.
You mentioned file locks...
There are related issues, like what'd happen if the original po file gets
updated (through makemessages/xgettext) while someone is editing it.
I'll give your patch a try, though, and write a couple (much needed) test cases.
Original comment by mbonetti
on 15 Jun 2010 at 6:30
Thanks for the response. Specifically, the overwriting I mentioned happens if:
1. A string shows up on both translators' page;
2. Translator A makes a change to the string, and saves;
3. Translator B saves;
4. Translator A's change is lost, even if B did not changes that string.
This is something that might possibly be fixed by detecting if a field has been
changed before updating the .po file. If B did change the field, it's
reasonable to let him overwrite A's change.
Regarding to the .po file being updated elsewhere, I found a bug with my patch:
Rosetta uses sequential index instead of msgids; if a string is removed or
added, the .po file will have a different index from the one Rosetta knows,
which might lead to submitted translations being put in wrong slots. This can
be fixed by using POFile.find(msgid) to get the right entry, but I'm not sure
what's the best way to pass around msgid. One idea is to use a hash of msgid
instead of sequential id, then either iterate through the new .po file to
compare hashes, or keep a hashtable in Rosetta code to get the msgid, or preach
polib to add the hash as a possible lookup field. The other is to get it from
the cached POFile in session, which would have the right order of entries. What
do you think?
Original comment by alass...@gmail.com
on 16 Jun 2010 at 10:23
A colleague of mine mentioned the possibility of using Django's cache framework
in the case of read-only .po files. Because Django cache is per-site, using it
would make handling read-only .po files more similar to the normal case.
Instead of storing the entire file in a session, we can just store a hashtable
of translation entries that are on the user's current page.
Original comment by alass...@gmail.com
on 16 Jun 2010 at 11:18
A hashtable-based solution using sha1 or md5 hashes of the msgid as keys is a
good idea to avoid clashes, but as you mention, there is no easy way to update
the underlying POFile object.
Using the cache framework is tempting, but we cannot relay on the fact that the
user has enabled caching. Also, IIRC locmem is the default setting, which is
thread- and process-safe (i.e. not site-wide), so it'd be equivalent to using
the session anyway.
Original comment by mbonetti
on 16 Jun 2010 at 11:54
I had a go at hashing. Basically I changed the code attaching sequential id to
the entry objects to use a md5 hash instead; the hash is then used to look up
the entry in the reloaded POFile using find(). This is not a proper hashtable
solution - it relies on md5 hashes being unique. In our case, since we're
dealing with natural languages, I think the chance of collision is negligible.
Original comment by alass...@gmail.com
on 30 Jun 2010 at 1:35
Attachments:
Thank you for the patch, I'll give it thorough look as soon as I can get some
free time.
Original comment by mbonetti
on 2 Jul 2010 at 1:40
A quick update to let you know I applied both patches and all the unit tests
run fine. I'll write another couple tests to simulate concurrency related
issues and I'll commit this. Thank you for your help!
Original comment by mbonetti
on 3 Jul 2010 at 1:12
Cheers, this is fixed as of r98.
I slightly changed your approach in that I use md5(msgid) + md5(msgstr) as the
hash for the individual entries.
This has the advantage that if user B submits a translation for a string which
was translated by user A after it was last loaded by B, the hash won't match
(because md5(msgstr) has changed) and we can warn user B by displaying a scary
warning note. I added a couple unit tests to cover these cases.
Thanks again for your support on this.
Original comment by mbonetti
on 4 Jul 2010 at 11:27
Thanks, would you be doing a new release soon?
Original comment by alass...@gmail.com
on 6 Jul 2010 at 2:07
definitely, I was just hoping that I'd get some feedback if anything was wrong
with the trunk...
Original comment by mbonetti
on 6 Jul 2010 at 2:17
Original issue reported on code.google.com by
alass...@gmail.com
on 15 Jun 2010 at 4:58Attachments: