Write a mode for typing in the bubble names on the home screen

cesine commented 13 years ago

We are going to auto-generate the stimuli images from the eBAT stimulus book, but we have no way of automatically naming the home screen bubbles for the 30 sections of the BAT for a new language. Instead we would like to offer a "mode" where the user can type in the bubble names for that language. If they don't type in the bubble names, the bubbles will simply stay in English by default.

AndreAchim commented 13 years ago

Instead of typing, could the bubble names be replaced through cut and paste? I know that the language BAT PDF file is currently a single image per page, but with Adobe Acrobat Pro these can be converted to text-based pages. For instance, when we try to highlight a portion, a message pops up asking if we want the conversion to be done. The following was cut from the top of the first page of the French BAT.pdf thus transformed: "Date de l'examen". I do not know if Acrobat Pro would convert Inuktitut, but for those languages that it will transform to text, the PDF version on the McGill site could be text-based.

cesine commented 13 years ago

I agree, it would be awesome if the eBAT pages were not pictures, and if someone ran Adobe Pro on them to give them a text layer. I don't have access to Adobe Pro; could you look into the feasibility of this? This would be useful for many other reasons too...

(I expect it will probably work for ~10 of the languages)

If worst comes to worst, we can have someone simply type the 30 section names for all the languages into a centralized text file which the app can download and read from. It would probably take 4-5 hours. I was planning on doing it for the 6 languages we selected.
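A minimal sketch of how the app might read such a centralized file and still stay in English for anything that was not typed in. The JSON layout, the section keys, the English default labels and the function name `bubbleNamesFor` are all assumptions made for illustration; nothing in this thread fixes the file format.

```javascript
// Hypothetical English defaults; only three of the 30 sections are shown.
var DEFAULT_BUBBLES = {
  naming: "Naming",
  mental_arithmetic: "Mental arithmetic",
  copying: "Copying"
};

// Returns the bubble names for languageCode, falling back to the
// English default for every key that is missing or left blank.
// downloadedText is assumed to be JSON shaped like
// { "fr": { "naming": "Dénomination", ... }, ... }.
function bubbleNamesFor(languageCode, downloadedText) {
  var all;
  try {
    all = JSON.parse(downloadedText) || {};
  } catch (e) {
    all = {}; // unreadable file: keep every bubble in English
  }
  var localized = all[languageCode] || {};
  var result = {};
  Object.keys(DEFAULT_BUBBLES).forEach(function (key) {
    var typed = localized[key];
    var usable = typeof typed === "string" && typed.trim() !== "";
    result[key] = usable ? typed : DEFAULT_BUBBLES[key];
  });
  return result;
}
```

Falling back per key rather than per language means a partially typed list still shows whatever was entered, with the remaining bubbles in English.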

cesine commented 13 years ago

I split this issue up, with a new issue for extracting the text from the PDF: https://github.com/iLanguage/AndroidBilingualAphasiaTest/issues/19

The other part is the XML for the localization, which would look like this for each language if we decide to have someone just type them up:

Histoire du bilinguisme; Contexte d'apprentissage et d'utilisation du français; Langage spontané; Désignation; Ordres simples et semi-complexes; Ordres complexes; Discrimination auditive verbale; Compréhension de structures syntaxiques; Compatibilité sémantique; Synonymes; Antonymes; Jugement d'acceptabilité; Acceptabilité sémantique; Décision lexicale; Séries; Fluence verbale; Dénomination; Construction de phrases; Contraires sémantiques; Morphologie; Contraires morphologiques; Description; Calcul mental; Compréhension auditive; Lecture à haute voix; Copie; Dictée mots; Dictée phrases; Lecture silencieuse des mots; Lecture silencieuse des phrases; Écriture spontanée

cesine commented 13 years ago

I committed two files showing what the localization would look like: https://github.com/iLanguage/AndroidBilingualAphasiaTest/commit/607237043af84e4536d61f00fab7a3cc16417a60

Localizing consists of creating a directory for each language with the two-letter language code as a suffix; see this repository for an example: http://code.google.com/p/mytracks/source/browse/MyTracks/res/

AndreAchim commented 13 years ago

I tried the Amharic series. You can temporarily retrieve the files from the link http://www.er.uqam.ca/nobel/r11274/amharic_stimulus_book.zip. The titles in the stimulus books (there are two of them, including one for non-literate patients) do not translate. I nevertheless saved the non-literate one because the conversion process straightened the many pages that had been scanned at an angle.

In the experimenter books, a few items did not translate from picture to text, but it seems that all test items translated. Before exploring these files to see whether what is still in images would create problems, I think you should verify that the text can usefully be copied into Android. When I cut from the 'translated' document and paste into Word, I get characters, but not Amharic characters. It seems that Amharic character sets may be purchased for about $130, which is not a very attractive prospect. If foreign characters such as Amharic are readily available under Android and a character-based approach is to be developed, and if we cannot assume that all test items always translate from picture to text, then extracting information automatically might need some exception handling. If the Amharic characters cannot be displayed, we could think of a foreign-character version of the BAT in which text would be replaced by images of each text segment. That may require major modifications, or it may turn out to be simple if someone has already developed routines that replace text-writing routines with image-displaying routines.

AndreAchim commented 13 years ago

I copied the text of my previous comment into the issue on GitHub. Not knowing how to finish adding a comment, I clicked on a button that said something like 'Terminate and comment'. It indicated that I had terminated the issue, which I doubt is what I intended. Can you fix it, or tell me how to fix it if it must be done by me? Also, how are we supposed to finish a comment (to have it displayed as a posted comment, not one still being written)?

André

cesine commented 13 years ago

> When I cut from the 'translated' document and paste into Word, I get characters, but not Amharic characters.

OCR (Optical Character Recognition) can only work if it knows what language the document is in. I doubt that Adobe Pro has an OCR package for Amharic, so it is normal that it recognized garbage, rather than Amharic.

> It seems that Amharic character sets may be purchased for about $130, which is not a very attractive prospect. If foreign characters such as Amharic are readily available under Android and a character-based approach is to be developed, and if we cannot assume that all test items always translate from picture to text, then extracting information automatically might need some exception handling.

Amharic is included for free in Unicode, but I'm sure Amharic OCR costs at least $130. I agree, I think it would be far too hit-and-miss to try to run OCR on the eBAT to find the subsections. This is a task that must be checked by humans and will take many hours per eBAT (maybe by people who use the eBAT and would like it to be text rather than images).
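To make the human checking a little cheaper, one rough pre-filter would be to flag extracted strings that contain hardly any characters from the Unicode Ethiopic block (U+1200 to U+137F), which is the block Amharic is written in. This is only a sketch: the function names and the 0.8 threshold are arbitrary assumptions, and passing the check says nothing about whether the text is the right text, only that it is in the right script.

```javascript
// Fraction of non-whitespace characters that fall inside the
// Unicode Ethiopic block (U+1200 to U+137F).
function ethiopicRatio(text) {
  var total = 0;
  var ethiopic = 0;
  Array.from(text).forEach(function (ch) {
    if (/\s/.test(ch)) {
      return; // ignore spaces and line breaks
    }
    total += 1;
    var codePoint = ch.codePointAt(0);
    if (codePoint >= 0x1200 && codePoint <= 0x137f) {
      ethiopic += 1;
    }
  });
  return total === 0 ? 0 : ethiopic / total;
}

// Hypothetical helper: true when a string extracted for an Amharic
// section should go to a person. The default threshold is an assumption.
function needsHumanReview(text, threshold) {
  var limit = typeof threshold === "number" ? threshold : 0.8;
  return ethiopicRatio(text) < limit;
}
```

A string pasted out of a conversion that did not know the language would mostly consist of Latin letters or symbols, so it would be flagged, as would an empty string.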

> If the Amharic characters cannot be displayed, we could think of a foreign-character version of the BAT in which text would be replaced by images of each text segment. That may require major modifications, or it may turn out to be simple if someone has already developed routines that replace text-writing routines with image-displaying routines.

I think this is the right strategy. The routines for drawing text can be easily replaced with drawing images (since it is an HTML5 canvas). This is a very easy solution, and the one that I think we are going to go with.

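    var context = canvas.getContext("2d");
    context.drawImage(image_element, dx, dy);   // draw the subsection name as an image
    context.fillText("The subsection", dx, dy); // draw the subsection name as text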
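A sketch of how the two calls could sit behind one routine, so that a bubble is drawn from an image when one exists for the language and from text otherwise. The `label` object shape and the name `drawLabel` are assumptions for illustration; only `drawImage` and `fillText` are the standard canvas 2D context methods.

```javascript
// label is a hypothetical object: { text: "...", image: imageElementOrNull }.
// Draws the image when there is one, otherwise falls back to drawing text.
// Returns which path was taken, which is handy for logging.
function drawLabel(context, label, dx, dy) {
  if (label.image) {
    context.drawImage(label.image, dx, dy);
    return "image";
  }
  context.fillText(label.text, dx, dy);
  return "text";
}
```

In the browser this would be called as `drawLabel(canvas.getContext("2d"), { text: "The subsection", image: null }, dx, dy)`, with the image being an already loaded image element for languages whose names cannot be shown as text.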

Write a mode for typing in the bubble names on the home screen #12