programminghistorian / ph-submissions

The repository and website hosting the peer review process for new Programming Historian lessons
http://programminghistorian.github.io/ph-submissions

Corpus Analysis with Voyant Tools (translation from Spanish) #608

Open hawc2 opened 5 months ago

hawc2 commented 5 months ago

Programming Historian in English has received a proposal for a translation from Spanish of "Corpus Analysis with Voyant Tools" by @asmartinez and @zizneroz.

I have circulated this proposal for feedback within the English team. We have considered this proposal for:

We are pleased to have invited @asmartinez and @zizneroz to develop this Proposal into a Submission under the guidance of @giuliataurino as editor.

The Submission package should include:

We ask @asmartinez to share their Submission package with our Publishing team by email, copying in @giuliataurino.

We've agreed a submission date of April. We ask @asmartinez and @zizneroz to contact us if they need to revise this deadline.

When the Submission package is received, our Publishing team will process the new lesson materials, and prepare a Preview of the initial draft. They will post a comment in this Issue to provide the locations of all key files, as well as a link to the Preview where contributors can read the lesson as the draft progresses.

If we have not received the Submission package by April, @giuliataurino will attempt to contact @asmartinez and @zizneroz. If we do not receive any update, this Issue will be closed.

Our dedicated Ombudspersons are Ian Milligan (English), Silvia Gutiérrez De la Torre (español), Hélène Huet (français), and Luis Ferla (português). Please feel free to contact them at any time if you have concerns that you would like addressed by an impartial observer. Contacting the ombudspersons will have no impact on the outcome of any peer review.

charlottejmc commented 2 months ago

Hello @giuliataurino, @asmartinez and @zizneroz,

You can find the key files here:

You can review a preview of the lesson here:


Thank you for your submission @asmartinez and @zizneroz.

{% include figure.html filename="file-name-1.png" alt="Visual description of figure image" caption="Figure 1. Caption text to display" %}

Thank you! ✨

anisa-hawes commented 2 months ago

Hello Alberto @asmartinez and Eime Javier @zizneroz,

What's happening now?

Your lesson has been moved to the next phase of our workflow which is Phase 2: Initial Edit.

In this phase, your editor @giuliataurino will read your lesson, and provide some initial feedback. Giulia will post feedback and suggestions as a comment in this issue, so that you can revise your draft in the following phase (Phase 3: Revision 1).

%%{init: { 'logLevel': 'debug', 'theme': 'dark', 'themeVariables': {
              'cScale0': '#444444', 'cScaleLabel0': '#ffffff',
              'cScale1': '#882b4f', 'cScaleLabel1': '#ffffff',
              'cScale2': '#444444', 'cScaleLabel2': '#ffffff'
       } } }%%
timeline
Section Phase 1 <br> Submission
Who worked on this? : Publishing Assistant (@charlottejmc) 
All  Phase 1 tasks completed? : Yes
Section Phase 2 <br> Initial Edit
Who's working on this? : Editor (@giuliataurino)  
Expected completion date? : July 14
Section Phase 3 <br> Revision 1
Who's responsible? : Authors (@asmartinez + @zizneroz) 
Expected timeframe? : ~30 days after feedback is received

Note: The Mermaid diagram above may not render on GitHub mobile. Please check in via desktop when you have a moment.

anisa-hawes commented 2 months ago

Hello again Alberto @asmartinez and Eime Javier @zizneroz,

I've sent you each an invitation to join us as Outside Collaborators here on GitHub. This will give you the 'write access' you'll need to add your figure captions and alt-text as Charlotte suggests above. Your lesson file is here: /en/drafts/translations/corpus-analysis-voyant-tools.md, and you can edit it directly.

The original lesson was published before we introduced a requirement for alt-text to support all figures, so we will need to ask you to write the visual descriptions of each figure. However, you can translate or adapt the captions from the Spanish original as follows:

  1. caption="Guardar en UTF-8 en Windows: 1) Abrir Bloc de Notas, 2) Después de pegar o escribir el texto, dar clic en 'Guardar como' 3) En la ventana de 'codificación' seleccionar 'UTF-8' 4) Elegir nombre de archivo y guardar como .txt (Torresblanca, 2014)"
  2. caption="Guardar en UTF-8 en Mac: 1) Abrir TextEdit 2) Pegar el texto que se desea guardar 3) Convertir a texto plano (opción en el menú de 'Formato') 4) Al guardar, seleccionar el encoding 'UTF-8' (Creative Corner, 2016)"
  3. caption="Guardar en UTF-8 en Ubuntu: 1) Abrir Gedit 2) Después de pegar el texto, al guardar, seleccionar 'UTF-8' en la ventana de 'Codificación de caracteres'"
  4. caption="Cargar documentos"
  5. caption="Cirrus"
  6. caption="Lector"
  7. caption="Tendencias"
  8. caption="Sumario"
  9. caption="Contextos"
  10. caption="Abrir opciones"
  11. caption="Editar lista"
  12. caption="Quitar palabras vacías"
  13. caption="Frecuencia relativa"
  14. caption="Asimetría estadística"
  15. caption="Agregar columna de posición"
  16. caption="Exportar contextos"
  17. caption="Importar datos desde un archivo de textos"
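
For anyone preparing corpus files with a script rather than a text editor, the saving step described in captions 1 to 3 can be sketched in Python. This is a minimal sketch and not part of the lesson: the helper names `save_as_utf8` and `is_valid_utf8` are hypothetical, and the check only confirms that a file decodes as UTF-8, not that it was originally written in that encoding.

```python
from pathlib import Path


def save_as_utf8(text: str, path: Path) -> Path:
    """Write text to a plain-text file encoded as UTF-8."""
    path.write_text(text, encoding="utf-8")
    return path


def is_valid_utf8(path: Path) -> bool:
    """Return True if the file's bytes decode cleanly as UTF-8."""
    try:
        path.read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

Calling `save_as_utf8("codificación", Path("sample.txt"))` (a hypothetical file name) would produce a `.txt` file that keeps accented characters intact.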

With many thanks, Anisa

giuliataurino commented 2 months ago
  • I have made some adjustments to remove a number of subheadings from the Table of Contents, which was originally longer than we prefer, and felt a little too cluttered. The subheadings still appear in the lesson, but usually as bold or - bullet points, rather than ## Markdown Headings.

Hi @charlottejmc,

Thank you for uploading the translation and making the necessary edits. Should I then direct reviewers to ignore the differences in the subheading titles and structure, and focus on the content of the paragraphs only?

Best,

Giulia

giuliataurino commented 2 months ago

Hi Alberto @asmartinez and Eime Javier @zizneroz,

Thank you for taking on this translation and the additional work required for adapting the examples and new formatting guidelines.

Overall, the translation looks good. In the points below, I share a number of notes and suggestions of the kind that are fairly common in first-draft translations. Bear in mind that not all changes are required: some are questions or suggestions, some are reminders of what @anisa-hawes and @charlottejmc already pointed out in previous comments, and a few might be notes for the editorial team itself (Anisa and Charlotte, I'll let you make the call on which edits should be taken on by the editorial team). Despite the number of comments below, I want to reiterate that the quality of the translation is good and we very much appreciate your work to make this original submission accessible to non-Spanish speakers.

Introduction

Creating a Plain Text Corpus

Loading the Corpus

Exploring the corpus

Document summary: basic characteristics of your set of text

Cirrus and summary: frequencies and stop word filters

Terms

  • [ ] p. 79: consider transforming the phrase from passive to active for a smoother reading in English, such as "The following section will explore...";
  • [x] p. 80: please note that the phrase from the song translated from Spanish to English counts 7 (not 8) words. Either change the example itself with an English song (cf. p. 48) or translate the sentence by updating the correct word count to 7;
  • [x] p. 81: translate the word "corazón" to English ("heart") or replace with the English word based on the example given in p. 48 + update numbers and calculations to the correct ones based on the example in p. 48 (cf. p. 80, if the translation from the Spanish example is kept, then divide by 7, not 8; if not, update the word count based on the example you choose to provide);
  • [x] p. 81: the punctuation seems incorrect to me in a few instances of the original version. Here, I would replace ";" with a comma "," - as in, "xxxx occurrences out of xxxx words, while the gross frequency";
  • [x] p. 82-83: add bullet list? (this might be a question for @anisa-hawes and @charlottejmc, was this changed on purpose from the original for editorial reasons?) [Publishing Team]
  • [x] p. 83: the original version is "Contar (con la ‘frecuencia bruta o neta’ de cada término)", whereas the translation only mentions "gross frequency". Was there an error in the original version or the translation missed the term "net"? If the latter is true, then the translation should probably be "Count (with the ‘gross or net frequency’ of each term)";
  • [x] p. 85: adapt the translation to consider the new English corpus (which does not seem to include different countries);
  • [x] p. 88/Figure 14: might be worth adding a reference to the median value in the visualization, just like in the original image;
  • [x] p. 89: make sure this example is still giving the same skewness in the adapted translation;
  • [x] p. 92-93: missing space between the lines in the bullet list; [Publishing Team]
  • [x] p. 94: missing all links and missing the spaces between the lines in the bullet list. [Publishing Team]
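
The arithmetic behind the word-count points above (p. 80 and p. 81) can be checked with a short script. This is a minimal sketch under stated assumptions: the seven-word phrase is a placeholder rather than the lyric used in the lesson, and the whitespace tokenizer is a simplification that may not match how Voyant Tools splits words.

```python
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (word.strip(".,;:!?\"'()").lower() for word in text.split())
    return [t for t in tokens if t]


def frequencies(text: str) -> dict[str, tuple[int, float]]:
    """Map each term to (gross count, relative frequency = count / total words)."""
    tokens = tokenize(text)
    total = len(tokens)
    return {term: (n, n / total) for term, n in Counter(tokens).items()}


# Placeholder phrase with exactly 7 words (hypothetical, not from the lesson).
phrase = "my heart beats and my heart sings"
freqs = frequencies(phrase)
print(len(tokenize(phrase)))  # total number of words in the phrase
print(freqs["heart"])         # (gross count, relative frequency)
```

Here the relative frequency of "heart" is its gross count divided by the total number of words in the phrase, which is why the divisor in the lesson needs to match the translated word count.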

Words in context

  • [x] p. 96: "left and right concordance" is not translated - "en la ventana de “Contextos” es posible hacer consultas de las concordancias izquierdas y derechas de términos específicos" should perhaps be "in the “Contexts” window, it is possible to make left and right concordance queries of specific terms";
  • [x] p. 100: missing translation in the table section for "Consulta avanzada" + suffix “ción” needs to be adapted in the translation to "th" + add adapted example for "country precarious" (cf. original Spanish version's example: "esa condición regresaría frases cómo “la extrema desigualdad y la pobreza” donde se encuentra la palabra “pobreza” y “extrema”");
  • [x] + missing table formatting [Publishing Team]
  • [x] p. 101: add numbered list as in original + translate "Cuidado" to "Attention" in the table unless this is a change by the editorial team
  • [x] (@anisa-hawes and @charlottejmc, could you also verify that the table formatting is consistent in all sections? this formatting is different from other formatting both in the translation and in the original.). [Publishing Team]

Activity Answers

  • [x] p. 108 to 115: adapt and update translation to English version (both the "unique words" - tengo, hambre, sueño - and all answers below that);
  • [x] p. 117: add references and links to earlier sections as in the original version.

I remain available should you have any questions or doubts about these reviews.

Best,

Giulia

zizneroz commented 2 months ago

Hi Giulia (@giuliataurino),

Can you send me the Git collaboration invite again? The last one expired because I couldn't find it in my emails.

Best,

Eime

anisa-hawes commented 2 months ago

Hello Eime @zizneroz.

I already re-sent you the invitation to join us as an Outside Collaborator.

We ask authors to work on their own files with direct commits: we don't want you to use the Pull Request system, or fork our repo to edit in ph-submissions. You can make direct commits to your file here: en/drafts/translations/corpus-analysis-voyant-tools.md.

With thanks, Anisa.

zizneroz commented 2 months ago

Hi Anisa @anisa-hawes,

Got it.

I accepted the invite and I'll make the commits directly to the file.

Thank you!

charlottejmc commented 1 month ago

Hello @asmartinez, @zizneroz and @giuliataurino,

I just wanted to point out that I have edited Giulia's comment slightly to turn each point into a checkbox. This should help us keep track of which changes have been made, and which ones still need to be looked at!

You'll see that I have also added [Publishing Team] to several points – these are things @anisa-hawes and I will take care of ourselves.

Thank you!

charlottejmc commented 1 month ago

Hello again @zizneroz,

I saw from your commit that you have been working on the figure alt-text. Thank you! However, I think we will still need some more work to make them truly effective as 'alt-text'.

This descriptive element enables screen readers to read the information conveyed in the images for people with visual impairments, different learning abilities, or who cannot otherwise view them, for example due to a slow internet connection. It's important to say that alt-text should go further than repeating the figure captions.

We have found Amy Cesal's guide to Writing Alt Text for Data Visualization useful. This guide advises that alt-text for graphs and data visualisations should consist of the following:

alt="[Chart type] of [data type] where [reason for including chart]"

What Amy Cesal's guide achieves is prompting an author to reflect on their reasons for including the graph or visualisation. What idea does this support? What can a reader learn or understand from this visual?
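
To make the template concrete, here is a minimal sketch of how its three parts could be assembled into the `alt` attribute of the figure include shown earlier in this thread. The function names and any sample values are hypothetical, the attribute layout simply mirrors the `figure.html` example above, and the sketch does not escape double quotes inside the text.

```python
def build_alt_text(chart_type: str, data_type: str, reason: str) -> str:
    """Fill in the '[Chart type] of [data type] where [reason]' pattern."""
    return f"{chart_type} of {data_type} where {reason}"


def figure_include(filename: str, alt: str, caption: str) -> str:
    """Format a Liquid include line in the same shape as the example above."""
    return (
        "{% include figure.html "
        f'filename="{filename}" alt="{alt}" caption="{caption}" %}}'
    )


def repeats_caption(alt: str, caption: str) -> bool:
    """Flag alt-text that merely restates the caption (case-insensitive)."""
    return alt.strip().lower() == caption.strip().lower()
```

A check like `repeats_caption` only catches exact repetition; deciding whether the alt-text actually conveys what the figure demonstrates still needs a human reader.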

The Graphs section of Diagram Center's guidance is also useful. Some key points (relevant to all graph types) we can take away from it are:

  • Briefly describe the graph and give a summary if one is immediately apparent
  • Provide any titles and axis labels
  • It is not necessary to describe the visual attributes of the graph (colour, shading, line-style etc.) unless there is an explicit need
  • Often, data shown in a graph can be converted into accessible tables

For general images, Harvard's guidance notes some useful ideas. A key point is to keep descriptions simple, and adapt them to the context and purpose for which the image is being included.

Would you feel comfortable making a second draft of the alt-text for each of the figures? This is certainly a bit time-consuming, but we believe it is very worthwhile in terms of making your translation accessible to the broadest possible audience. We would be very grateful for your support with this.

Thank you ✨

anisa-hawes commented 1 month ago

Hello Alberto @asmartinez and Eime Javier @zizneroz,

What's happening now?

Your lesson has been moved to the next phase of our workflow which is Phase 3: Revision 1.

This phase is an opportunity for you to revise your draft in response to @giuliataurino's initial feedback. I've checked to ensure that you both have the 'write access' you'll need to edit your draft directly.

Please continue to make direct commits to your file here: en/drafts/translations/corpus-analysis-voyant-tools.md. @charlottejmc and I can help if you encounter any practical problems!

When both of you and Giulia are happy with the revised draft, we will move forward to Phase 4: Open Peer Review.

%%{init: { 'logLevel': 'debug', 'theme': 'dark', 'themeVariables': {
              'cScale0': '#444444', 'cScaleLabel0': '#ffffff',
              'cScale1': '#882b4f', 'cScaleLabel1': '#ffffff',
              'cScale2': '#444444', 'cScaleLabel2': '#ffffff'
       } } }%%
timeline
Section Phase 2 <br> Initial Edit
Who worked on this? : Editor (@giuliataurino) 
All  Phase 2 tasks completed? : Yes
Section Phase 3 <br> Revision 1
Who's working on this? : Authors (@asmartinez + @zizneroz)  
Expected completion date? : July 28
Section Phase 4 <br> Open Peer Review
Who's responsible? : Reviewers (TBC) 
Expected timeframe? : ~60 days after request is accepted

Note: The Mermaid diagram above may not render on GitHub mobile. Please check in via desktop when you have a moment.

giuliataurino commented 1 month ago

Hi @asmartinez and @zizneroz,

As we approach the round of revisions, I wanted to check in with you to make sure that you have implemented the initial edits. If so, I will add the two reviewers to this issue so that they can start reviewing the translation.

For everybody's reference, I will be out of the office the first two weeks of August.

Thank you all for your work! Best,

Giulia

asmartinez commented 1 month ago

Hello Giulia,

I spoke with Javier and he told me that he is still making some adjustments and expects to have everything done by the 28th. Does this work for you all?

anisa-hawes commented 4 weeks ago

Thank you, Alberto @asmartinez!

It would be wonderful if, as you and Javier @zizneroz finalise your Phase 3 revisions, you could go through Charlotte's and Giulia's checklists of tasks/questions and check off those you have resolved.

This will be a great help to @giuliataurino when it comes to re-reading your draft, and confirming if it is ready to move onwards to Phase 4 Open Peer Review.

Please let us know if you have any questions. We are here to help 👐🏼

zizneroz commented 3 weeks ago

Hi Anisa (@anisa-hawes), Giulia (@giuliataurino) and Charlotte (@charlottejmc)

We’ve finished making the changes based on the feedback we received and marked the completed ones in the checklist.

We just have one question about this comment: "Exploring the corpus, p. 32: unfortunately, in the screenshot the term chosen as an example is cut off. Add a new screenshot where the term is visible and readable. @zizneroz, please send this new image to Charlotte at publishing.assistant[@]programminghistorian.org"

We’re not sure which screenshot this refers to since there are several in that section. Also, no term is actually selected. The section is just meant to show the parts of Voyant Tools.

anisa-hawes commented 3 weeks ago

Thank you for your work on these revisions, Javier @zizneroz!

I'd like to ask if you could also revisit Charlotte's checklist earlier in this thread, and check off the tasks/questions you've resolved there. In particular, I notice that you haven't yet developed the alt-text to accompany each figure, which remains a repetition of the captions. We need to ensure this lesson will be accessible to those who are visually impaired and use screen-readers, so the alt-text must describe what the figures demonstrate. Charlotte has shared some advice and links to resources in her comment above.

Hello Giulia, @giuliataurino - Could you confirm which figure you felt was problematic/cropped incorrectly within the section titled Exploring the Corpus? (I wonder if it might have been Figure 8, where the words Average Word Per Sentence are partially obscured within the summary?)

zizneroz commented 1 week ago

Hi Anisa (@anisa-hawes),

Thanks for the feedback! I've gone ahead and updated the alt-texts to be more descriptive, so they better explain what each figure is showing. I also went through Charlotte's checklist and checked off the tasks that are done. Let me know if there's anything else that needs tweaking or if I missed something!

giuliataurino commented 4 days ago

Hi @anisa-hawes,

I believe it was figure 9 (where the term column doesn't show the term), but I do see a sentence cropped in figure 8 as well.

giuliataurino commented 4 days ago

Hi @zizneroz and @asmartinez,

Hope you are well!

Now that you are almost done with Phase 3 - Revision 1, I'd like to introduce you to the two reviewers, @marisolam and @betovargas (welcome!), who have kindly agreed to review your translation.

Let us know when the submission is correctly edited and ready for the open peer review so that we can officially start Phase 4 of the editing process.

I remain available should you have further questions.

Best,

Giulia