adapt-it / adaptit

Related language translation editor

Balsa-XO: punctuation causes source phrase to show up twice in knowledgebase #46

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. I do not know whether this is XO-related or not.
2. I have chosen a translation for a word (e.g. Priester for priest).
3. Now, in the context "priest." (followed by a full stop), AI does not supply
Priester automatically as the translation. When I open the Choose Translation
dialog, Priester is there (as the only translation). I click OK, and now,
looking in the knowledge base, there are two entries:
- priest     and
- priest.

What is the expected output? What do you see instead?
Priester should be entered automatically, and there should not be two entries
in the knowledge base.

Could it be that I have chosen the wrong setting in AI?

What version of the product are you using? On what operating system?

Please provide any additional information below.

Original issue reported on code.google.com by wolfgang...@gmx.de on 17 Apr 2009 at 8:04

GoogleCodeExporter commented 9 years ago
I am not quite sure whether I described this right.
Here is a second try.
I have already adapted 'you' as 'ihr' (as the only translation).
Now 'you.' comes up in the text, and 'ihr' is not adapted automatically.
Instead it shows 'you.' as the source text and asks for a translation for it.
So it must be that the punctuation mark '.' is being treated as an ordinary
character. I will check the settings:
In Edit/Preferences/Punctuation, the '.' is defined for both the Source and
Target language:   . -> .
So there should not be a problem with punctuation marks being treated as
characters.

Original comment by wolfgang...@gmx.de on 17 Apr 2009 at 8:32
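
The behaviour expected in this comment, where 'you.' is matched against the existing KB entry for 'you', can be illustrated with a minimal Python sketch. It is not taken from the Adapt It source code: the punctuation list, the dictionary-based KB, and the function names `strip_punctuation` and `lookup` are all assumptions made for illustration.

```python
# Hypothetical sketch: none of these names come from the Adapt It source code.
# Assumed punctuation list; in Adapt It this is configured per project
# under Edit/Preferences/Punctuation.
SOURCE_PUNCTUATION = ".,;:?!()\"'"


def strip_punctuation(word, punctuation=SOURCE_PUNCTUATION):
    """Remove leading and trailing punctuation characters from a source word."""
    return word.strip(punctuation)


def lookup(kb, source_word):
    """Look up a source word in the KB under its punctuation-free key."""
    return kb.get(strip_punctuation(source_word), [])


kb = {"you": ["ihr"]}
print(lookup(kb, "you."))  # ['ihr']
print(lookup(kb, "you"))   # ['ihr']
```

With a lookup of this kind, 'you' and 'you.' share one KB entry, so the second entry reported above would not be created.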

GoogleCodeExporter commented 9 years ago
Adapt It is designed not to put punctuation into the KB along with
translations. It does allow you to explicitly add translations with punctuation
to the KB if you do so using the KB Editor (Tools menu). Could the 'you.' and
'priest.' have been entered manually into the KB using the KB Editor? You could
try executing the "Restore Knowledge Base..." command on the File menu. You
must first close any open document using File > Close. Allow Adapt It to
restore your KB from the appropriate documents. Once it has completed, check
the KB again to see if there are any translation words there which have
punctuation. If so, let me know, pack your document (File > Pack Document), and
send it to me as an email attachment at bill_martin@sil.org. You could also zip
the knowledge base file and attach it to the same email. If your project is
called "German to English adaptations", the KB file will be in that project
folder and will be called "German to English adaptations.xml".

Original comment by adaptitbill@gmail.com on 18 Apr 2009 at 3:54

GoogleCodeExporter commented 9 years ago
I did not use the KB editor to enter translations.

Here are two additional tests I did to describe the problem:

1st try:
In the knowledge base created for 2JN, I deleted the 5-10 source phrases with a
final full stop.
Then I loaded 2JN again as a new document (this time with a different name),
and some words (preceding a full stop) were not recognized as already being in
the KB.
I typed in the translations (without full stops).
Checking the KB, the source phrases with a final full stop are there again.

Two interesting cases I noticed:

1. 'you.' in the text:
- in \v 9, at the end, it chooses the translation for 'you' (euch, ihr), and
- in \v 12, in the middle, it chooses the translation for 'you.' (euch)
('you.' had already got into the KB in a previous verse).
2. The KB also contains the source phrase 'lord).'

2nd try:
Now I did what you suggested (in comment 2):
- Close the open document (I had only one open)
- File/Restore KB
- yes, yes
- choose the database for the text in question
...

This gives the same result as the first try: in the KB there are 5-10 source
phrases with a final full stop, as well as the two interesting cases mentioned
above.

Original comment by wolfgang...@gmx.de on 20 Apr 2009 at 9:20

GoogleCodeExporter commented 9 years ago
Thanks again, Wolfgang. Yes, we have determined that a bug was inadvertently
introduced back in January that allows some punctuation to enter the KB. The
problem appears, we think, only for source words which have final punctuation
(i.e., a full stop) AND are immediately followed by footnote or endnote markers
(i.e., \f, \f*, \fe or \fe*). Looking through the work you did in the 2Jn text,
I saw that the problem only happened for source words that were followed by a
full stop and immediately followed by one of the aforementioned backslash
markers. We are working on a fix for this, which will be in the next release,
hopefully by the end of April. Please comment further if you find that this
problem occurs in some other context. Thanks.

Original comment by adaptitbill@gmail.com on 23 Apr 2009 at 4:34
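
To check a source text for the context described in this comment (a word with final punctuation immediately followed by \f, \f*, \fe or \fe*), a small Python sketch along the following lines could be used. It is only an illustration, not Adapt It's own parsing logic: the function name, the punctuation set, and the choice to tolerate optional whitespace before the marker are all assumptions.

```python
import re

# Hypothetical sketch, not Adapt It's parser. It matches a word, its final
# punctuation (assumed set), optional whitespace, and one of the markers
# \f, \f*, \fe or \fe*. The negative lookahead keeps other markers such as
# \fr or \ft from being mistaken for \f.
PATTERN = re.compile(r"(\w+)([.,;:?!]+)\s*(\\fe?\*?)(?![A-Za-z0-9])")


def find_risky_words(text):
    """Return (word, punctuation, marker) tuples for each suspect context."""
    return PATTERN.findall(text)


sample = r"He is the Lord.\f + \fr 1.3 \ft A note.\f* Then you. And peace,\fe* end"
for word, punct, marker in find_risky_words(sample):
    print(f"{word}{punct} followed by {marker}")
```

In the sample, 'Lord.', 'note.' and 'peace,' are reported, while 'you.' is not, because it is followed by ordinary text rather than a footnote or endnote marker.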

GoogleCodeExporter commented 9 years ago

Original comment by adaptitbill@gmail.com on 24 Apr 2009 at 4:19

GoogleCodeExporter commented 9 years ago
In 4.1.4 I did what you said in your mail 'Adapt It WX version 4.1.4 now
available':

...
* Fixed a bug that was allowing final punctuation to enter the
Knowledgebase and be stored in adapted documents. If a user's
current knowledge base has wrongly admitted items with
punctuation into fields that should not contain punctuation,
the user can fix the problem by doing the following:
   (1) run Adapt It,
   (2) Choose the project but cancel out at the Document page
       of the wizard so that no document is opened (or select
       File > Close if a document was opened), and
   (3) run the File > Restore Knowledgebase... command,
       selecting all documents that should be used for
       rebuilding the KB.
   (4) If Restore Knowledgebase... command discovered any
       wrongly stored punctuation in the KB and adaptation
       documents, the user is notified and an error log called
       KBRestoreErrorLog.txt is saved in the project folder.
The rebuilt KB will no longer contain any final punctuation.
If any of the documents used to restore the KB contained
incorrectly stored punctuation, that punctuation is removed
from the appropriate fields of those documents and the
documents are re-saved. 
...

I then continued to adapt and (after restoring the KB again) found the
following in the KBRestoreErrorLog.txt file:

This is the KBRestoreErrorLog.txt file - created Sat, May 02, 11:59, 2009.

During the KB Restore operation, punctuation errors were found and corrected in the KB,
and changes were made to the punctuation stored in one or more documents used to
restore the KB.
Please Note the Following:
* You should no longer notice any punctuation in KB entries when viewed with the KB
Editor.
* With punctuation purged from the KB Adapt It should handle punctuation in your
documents as you expect.
* You may wish to open the document(s) below in Adapt It and check the punctuation
for the items listed.

   In the following document(s) punctuation was removed from non-punctuation fields
(see below):
   ----------------------------------------
   3. Johannes.xml:
      * No changes were made in this file! *
   ----------------------------------------
   Mark: CEV - German.xml:
      * No changes were made in this file! *
   ----------------------------------------
   1. Johannes.xml:
      * No changes were made in this file! *
   ----------------------------------------
   2. Johannes.xml:
      * No changes were made in this file! *
   ----------------------------------------
   1. Petrus.xml:
      "blood." was changed to "blood"
      "9.18-21." was changed to "9.18-21"
      "day." was changed to "day"
      "people." was changed to "people"

End of log.

So I am not sure whether words with final punctuation are still entering the KB.

Original comment by wolfgang...@gmx.de on 2 May 2009 at 10:13
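
The kind of clean-up reported in the log above can be approximated with a short Python sketch. This is a simplification under stated assumptions, not the actual Restore Knowledgebase... implementation: the real command rebuilds the KB from the adaptation documents, whereas this sketch only strips final punctuation from the keys of a dictionary-based KB, merges the resulting duplicates, and produces messages in the same style as the log. The function name and the punctuation set are hypothetical.

```python
# Hypothetical sketch; the real command rebuilds the KB from the documents.
FINAL_PUNCTUATION = ".,;:?!"  # assumed set of final punctuation characters


def purge_final_punctuation(kb):
    """Return (clean_kb, log_lines) with final punctuation removed from keys."""
    clean = {}
    log = []
    for source, translations in kb.items():
        # Keep the original key if stripping would leave nothing behind.
        key = source.rstrip(FINAL_PUNCTUATION) or source
        if key != source:
            log.append(f'"{source}" was changed to "{key}"')
        # Merge translations of 'you.' into the entry for 'you', no duplicates.
        merged = clean.setdefault(key, [])
        for translation in translations:
            if translation not in merged:
                merged.append(translation)
    return clean, log


kb = {"you": ["euch", "ihr"], "you.": ["euch"], "blood.": ["Blut"]}
clean_kb, log_lines = purge_final_punctuation(kb)
print(clean_kb)
for line in log_lines:
    print(line)
```

Only trailing characters are stripped, so an entry such as "9.18-21." becomes "9.18-21" and keeps its internal full stop, matching the change listed in the log.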

GoogleCodeExporter commented 9 years ago
I tried this out on the XO-Balsa with AI 4.1.4:
There were source phrases with punctuation in the KB.
After restoring the KB, all of those were gone.
As I continued to adapt a text (3JN), punctuation was no longer entered into
the KB.

I found one exception: at the end of 'end fn' there is the source word
'Lord.' in the text (without apostrophes), and 'Lord.' (without apostrophes)
was entered into the KB.

-> Check this out.

Original comment by wolfgang...@gmx.de on 4 May 2009 at 7:49

GoogleCodeExporter commented 9 years ago
Please use the Pack Document... command on the File menu to pack the 3JN
document, and send the packed <name>.aip file to me as an attachment. Send it
to bill_martin@sil.org and I will look at the situation you mention where
punctuation appears to get into the KB. Thanks.

Original comment by adaptitbill@gmail.com on 5 May 2009 at 2:11

GoogleCodeExporter commented 9 years ago
I tested 3JN again in Balsa-XO in a new project with AI 4.1.4, and this time
the punctuation problem is gone.
I probably had not updated Balsa-XO to AI 4.1.4, which is why the problem
occurred.

Original comment by wolfgang...@gmx.de on 7 May 2009 at 6:59

eb1 commented 9 years ago

From the last comment, it appears that this issue was addressed by Adapt It 4.1.4.