jim1001 / DemPower-Issues

Issue tracker for DemPower private repository

Encode Swedish text files #157

Open jim1001 opened 6 years ago

jim1001 commented 6 years ago

Therese - in order to correctly read the .txt files you are uploading to DropBox (with all the Swedish accents) I need to know the encoding you are using to produce them. The fix that I thought might work earlier, using tags, works for special characters in English but apparently not for Swedish accents.

Can we pick one example, say 0.2A.txt? Can you go to the software / editor you used to create this file and check what encoding you have set? It should be in Settings / Options or you may see it in File - Properties. It may be called character set or charset. It is often listed in the bottom status bar of the editor when you have the file loaded.

Thanks,

thebi84 commented 6 years ago

I think you mean this: Unicode UTF-8?

jim1001 commented 6 years ago

OK, thanks - that is the best encoding to use. If you produced 0.2A.txt with Unicode UTF-8 then I can do some tests based on that - when I get a moment!

thebi84 commented 6 years ago

Thank you!

jim1001 commented 6 years ago

The problem is that I can't read the accents in the latest text files you've uploaded to DropBox.

If I look at 0.2A.txt on dropbox.com the Swedish accents don’t show correctly.

If I download 0.2A.txt to my PC and open it in a text editor the editor detects it as ANSI encoded. OK - sometimes they guess wrong. But if I tell the editor it is UTF-8 I still don’t see the Swedish accents correctly.

If I look at sv.yml on dropbox.com the Swedish accents do show correctly. My text editor detects it as UTF-8 & all accents show correctly.

Have you always been using the same editor with same settings to produce your text files?

Could you try again to open, save & upload 0.2A.txt and make sure you are definitely saving as UTF-8 encoded? Save it as 0.2A_v2.txt so we can compare.

Thanks.
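For anyone wanting to check a file without relying on an editor's guess, here is a minimal Python sketch. It only classifies raw bytes as UTF-8 with BOM, plain UTF-8, or not valid UTF-8. The function name `describe_encoding` is made up for illustration, and treating "ANSI" as Windows-1252 is an assumption about the PC that produced the file.

```python
import codecs


def describe_encoding(data: bytes) -> str:
    """Classify raw file bytes as 'utf-8-bom', 'utf-8' or 'not-utf-8'."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # A lone byte such as 0xE4 ("ä" in Windows-1252) is not valid UTF-8.
        return "not-utf-8"
    # Valid UTF-8: report whether it starts with the byte order mark EF BB BF.
    return "utf-8-bom" if data.startswith(codecs.BOM_UTF8) else "utf-8"


# In-memory examples; for a real file, read it with open(path, "rb").read().
print(describe_encoding("ä".encode("cp1252")))     # assumed "ANSI" bytes
print(describe_encoding("ä".encode("utf-8")))      # plain UTF-8
print(describe_encoding("ä".encode("utf-8-sig")))  # UTF-8 with BOM
```

Note that a file containing only plain English characters is valid UTF-8 whatever it was saved as, so this check is only meaningful for files that actually contain accents.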

jim1001 commented 6 years ago

Tried 0.2A_v2.txt you uploaded - can see all accents correctly in browser and in my editor when downloaded. The editor auto-detected it as UTF-8. So all looks good. To make absolutely sure I will import into a test app - will post here when I've done that. Thanks.

thebi84 commented 6 years ago

Good!

jim1001 commented 6 years ago

I've tried 0.2A_v2.txt in a test app & it is OK.

Can you make sure all txt files you produce from now on are saved in Unicode UTF-8?

Files that you already produced with the accents like this <\ä> are OK - no need to re-post, but you don't need to write accents like this any more. As long as they are written & saved the same way as 0.2A_v2.txt (UTF-8) we should be OK.

It looks like all files you recently uploaded to VideoTranscripts need re-saving in UTF-8 & re-uploading.

Also I noticed the four Chapter Introductions C1.txt ...C4.txt have appeared in this folder - don't know why. They were already in the ChapterIntroductions folder.

jim1001 commented 6 years ago

I copied your updated translations for the Introduction videos to their DropBox folder & sync'd with my sv tablet. All accents show fine in DemPower.

jim1001 commented 6 years ago

Copied your updated translations for the Section Introduction videos to their DropBox folder & sync'd with my sv tablet. All look fine in DemPower (except 4.1 problem).

jim1001 commented 6 years ago

Transcript for communication video is done as well.

This is uploaded to DropBox install folder & works in my sv DemPower.

Note to myself: I had to convert Therese's file from UTF-8 to UTF-8 BOM in Notepad++ for accents to appear correctly on the tablet. Other couples video translations from Therese were recognised as UTF-8 BOM & didn't need conversion.
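The same UTF-8 to UTF-8 BOM conversion could be scripted rather than done by hand in the editor. This is only a sketch: `add_utf8_bom` and `convert_file` are hypothetical helper names, and it assumes the input is already valid UTF-8.

```python
import codecs


def add_utf8_bom(data: bytes) -> bytes:
    """Return the same UTF-8 bytes with a leading BOM (EF BB BF)."""
    data.decode("utf-8")  # raises UnicodeDecodeError if the input is not UTF-8
    if data.startswith(codecs.BOM_UTF8):
        return data  # already has a BOM, leave it unchanged
    return codecs.BOM_UTF8 + data


def convert_file(path: str) -> None:
    """Rewrite the file at `path` in place so that it starts with a UTF-8 BOM."""
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(add_utf8_bom(data))
```

Calling `convert_file` on a transcript would be the scripted equivalent of re-saving it as UTF-8 BOM. Running it twice is harmless because an existing BOM is detected and kept.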

jim1001 commented 6 years ago

4.1A.txt (https://www.dropbox.com/s/gocr96soxzn2f7e/4.1A.txt?dl=0) does not appear to be the correct transcript for the 4.1 Section introduction video.

thebi84 commented 6 years ago

Sorry, now it's correct.

thebi84 commented 6 years ago

Ok. The communication video I made is just 20 sec in the app…?

jim1001 commented 6 years ago

> The communication video I made is just 20 sec in the app…?

I must have only done a test. The full version is now uploaded here (https://www.dropbox.com/s/qq2f5yda4vzfqi3/4.6C1.webm?dl=0) & works on my tablet.

thebi84 commented 6 years ago

Thank you :)

jim1001 commented 6 years ago

The transcript files can be made to look better by adding formatting tags. See the attached 1.1C1.txt (https://github.com/jim1001/DemPower-Issues/files/1951538/1.1C1.txt).

See the English couples video screens in the app & compare. Compare the corresponding .txt files.

I can automate the process - just a case of when I have time...

thebi84 commented 6 years ago

Ok, will do.

jim1001 commented 6 years ago

Don't do it all by hand though as it's a waste of your time. I used the Notepad++ editor where you can do smart Find-Replace operations. I was going to further automate the process for your files & write a program...
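As a rough idea of what such a program might look like, here is a minimal Python sketch. It is not the actual conversion used for DemPower: it assumes paragraphs in the transcript are separated by blank lines and that the only tag needed is the paragraph tag, and the function name `wrap_paragraphs` is made up for illustration.

```python
import re


def wrap_paragraphs(text: str) -> str:
    """Wrap each blank-line-separated paragraph in <p>...</p>."""
    # Split on one or more blank lines, then drop empty pieces and stray whitespace.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    return "\n".join(f"<p>{p}</p>" for p in paragraphs)


print(wrap_paragraphs("Hej.\n\nTack så mycket.\n"))
```

Any other formatting tags used in the English transcripts would need their own rules, so the output should still be checked on a tablet.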

thebi84 commented 6 years ago

I don't want to learn new things anymore... 😫 I think I'll do it by hand.

jim1001 commented 6 years ago

Therese,

Please don't do any more transcripts until we have tried one and checked all is OK. Thanks, Jim

thebi84 commented 6 years ago

OK!

jim1001 commented 6 years ago

OK - you can now upload 1.1C1.txt & 2.1C1.txt from this folder (https://www.dropbox.com/sh/nfpcqby6cz5uin8/AADpX-QCizIGEuSN4qTyw37Va?dl=0) to your tablet & see your work. If you're happy then continue with the rest or wait for me to automate...

Note to myself: Had to re-save Therese's file to change encoding from UTF-8 to UTF-8 BOM in Notepad++ (as above).

thebi84 commented 6 years ago

It looks like it does in the text files. Can I view it in DemPower with the video? If it looks like it does in the English app it is fine.


thebi84 commented 6 years ago

I can view it now, it looks fine.


jim1001 commented 6 years ago

All couples transcripts you tagged have been converted & uploaded to the DropBox folder (https://www.dropbox.com/sh/nfpcqby6cz5uin8/AADpX-QCizIGEuSN4qTyw37Va?dl=0). I made a few small corrections. I've checked them on my tablet but it would be a good idea for your fresh eyes to check too.

If you are happy with them I suggest we delete the couples files you produced here (https://www.dropbox.com/sh/ozzriho52jpcwan/AADlYqDomOFX8ZjsY9I7o6oYa?dl=0) since these are unconverted & do not have the corrections. If we leave them we could get confused in future.

1.2C1 needs some attention - see how it looks on your tablet.

Are you planning to format 4.6C1?

thebi84 commented 6 years ago

Thanks, they look great. I revised 1.2C1 and 4.6C1.

I agree, I will delete them.


thebi84 commented 6 years ago

I did leave the revised files (1.2C1 & 4.6C1) since I can't upload to Videotranscripts-couples.


jim1001 commented 6 years ago

> I did leave the revised (1.2C1 & 4.6C1) since I can't upload on Videotranscripts-couples.

That's right - by design. I'm the only one with permission to change files under DemPower_app_install since these are final files that are uploaded to tablet.

I see you've put the revised files in the VideoTranscripts folder which is where you put all the others - so it's consistent. Thanks :-)

Will do them when I can...

jim1001 commented 6 years ago

Therese, you need to save 1.2C1.txt as UTF-8 & upload it again since I can't read it properly (it appears to be ANSI encoded).

Also in both 1.2C1 & 4.6C1 you have many close paragraph tags </p> without matching open paragraph tags <p> so this needs correcting as well. If you need more explanation then please ask. Thanks.
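A quick way to find such tags before uploading is sketched below in Python. It is only a sketch: it assumes the paragraph tags are written exactly as `<p>` and `</p>` with no attributes and are never nested, and the function name `unmatched_p_tags` is made up for illustration.

```python
import re


def unmatched_p_tags(text: str) -> list[tuple[int, str]]:
    """Return (line_number, tag) for every <p> or </p> that has no partner."""
    problems = []
    open_line = None  # line number of the currently open <p>, if any
    for line_no, line in enumerate(text.splitlines(), start=1):
        for tag in re.findall(r"</?p>", line):
            if tag == "<p>":
                if open_line is not None:
                    # The previous <p> was never closed.
                    problems.append((open_line, "<p>"))
                open_line = line_no
            else:
                if open_line is None:
                    # A </p> with no <p> before it.
                    problems.append((line_no, "</p>"))
                open_line = None
    if open_line is not None:
        problems.append((open_line, "<p>"))
    return problems


sample = "Hej</p>\n<p>Tack</p>\n<p>Slut"
print(unmatched_p_tags(sample))
```

An empty list means every `<p>` has a matching `</p>`; otherwise each entry gives the line to look at in the transcript.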

thebi84 commented 6 years ago

Hi Jim

Hope that it is ok now.

Therese


jim1001 commented 6 years ago

Therese, I tried to guess the formatting you wanted from your latest uploads & have inserted the necessary tags. If it's not what you want then you can upload how you want them to look in a Word doc & I'll make it look like that in the app.

1.2C1 & 4.6C1 have been uploaded to the Couples folder.

jim1001 commented 6 years ago

This from 25 days ago is still awaiting a fix:

4.1A.txt (https://www.dropbox.com/s/gocr96soxzn2f7e/4.1A.txt?dl=0) does not appear to be correct transcript for 4.1 Section introduction video

thebi84 commented 6 years ago

Text is uploaded to: text - video transcripts, but I can't find the audio file now.

jim1001 commented 6 years ago

4.1 Section introduction video with matching audio and transcript is now uploaded to DropBox. Please check it's correct.