datacarpentry / openrefine-socialsci

OpenRefine for Social Science Data
https://datacarpentry.org/openrefine-socialsci/

Awkward wording Trim Leading and Trailing Whitespace #137

Closed ACDMertin closed 1 year ago

ACDMertin commented 1 year ago

The opening sentence of the "Trim Leading and Trailing Whitespace" section is a little awkward:

Words with spaces at the beginning or end are particularly hard for we humans to tell from strings without, but the blank characters will make a difference to the computer.

I had to read it a few times to make sense of it while preparing my teaching demo. Maybe the wording below would help get the idea across a bit more directly:

Sometimes spaces will be present at the beginning or end of a word. While we can't often see or notice these (especially if they are at the end of a word), this can cause an issue for a computer. So we want to remove these.

Then continue as it is currently written.

Happy to make this change if deemed suitable, but I'm doing this as part of my Carpentries checkout and am new to Git :)

Thank you.

bencomp commented 1 year ago

That is a good suggestion, @ACDMertin, thanks! It is easy to overlook awkward wording when you get more experienced, so your fresh look is very welcome 😄

If you could make this suggestion into a pull request, I will happily accept it. If you don't feel comfortable doing this, please let me know and I can create the change.

bencomp commented 1 year ago

I created #138 to address this issue. The new text reads:

Sometimes spaces (or tabs, or newline characters) will be present at the beginning or end of a text cell. They may have been in the dataset that was imported, or appear when you perform operations on the data, such as splitting text. While we as humans cannot always see or notice these (especially if they are at the end of a word), a computer always sees them. We often see these spaces as unwanted variations and therefore remove them.

As of version 3.4, OpenRefine provides the option to trim (i.e. remove) leading and trailing whitespace during the import of data (see image at the top of this page). This is then applied to the data in all columns.

OpenRefine also provides a menu option to remove blank characters from the beginning and end of any entries in the column that you choose.
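
For anyone who wants to see the effect outside OpenRefine, here is a minimal Python sketch of the same idea. It is not part of the lesson, the cell values are made up for illustration, and it uses Python's built-in `str.strip()` rather than anything OpenRefine-specific.

```python
# Hypothetical cell values: the same word with stray whitespace around it.
cells = ["Nairobi", " Nairobi", "Nairobi\t", "Nairobi\n"]

# To a computer, each variant is a different string.
distinct_before = set(cells)

# str.strip() with no arguments removes leading and trailing whitespace,
# including spaces, tabs, and newline characters.
trimmed = [cell.strip() for cell in cells]
distinct_after = set(trimmed)

print(len(distinct_before), "distinct values before trimming")
print(len(distinct_after), "distinct value(s) after trimming")
```

Before trimming, the variants would show up as separate values (for example in a facet); after trimming they collapse into one.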

What do you think?