LibraryCarpentry / lc-data-intro

Library Carpentry: Introduction to Working with Data (Regular Expressions)
https://librarycarpentry.org/lc-data-intro/

Potential revision to initial paragraph (courtesy of @yoyology) #207

Open sharilaster opened 1 year ago

sharilaster commented 1 year ago

The following suggestion will need a PR opened once the lesson has been migrated to the Workbench platform:

I also feel that the initial paragraph of the lesson is difficult to understand. As it stands, the text reads:

Regular expressions are a concept and an implementation used in many different programming environments for sophisticated pattern matching. They are an incredibly powerful tool that can amplify your capacity to find, manage, and transform data and files.

A regular expression, often abbreviated to regex, is a method of using a sequence of characters to define a search to match strings, i.e. “find and replace”-like operations. In computation, a ‘string’ is a contiguous sequence of symbols or values. For example, a word, a date, a set of numbers (e.g., a phone number), or an alphanumeric value (e.g., an identifier). A string could be any length, ranging from empty (zero characters) to one that spans many lines of text (including line break characters). The terms ‘string’ and ‘line’ are sometimes used interchangeably, even when they are not strictly the same thing.

I would recommend the following:

Many different programming environments require a way to match patterns of characters to do things like ensuring that an e-mail address is properly entered into an online form. A common tool for this purpose is regular expressions. Using regular expressions (or regex for short) allows you to amplify your capacity to find, manage, and transform data and files.

A regular expression is a method of using a sequence of characters to define a search to match strings, i.e. “find and replace”-like operations. In computation, a ‘string’ is a contiguous sequence of symbols or values. For example, a word, a date, a set of numbers (e.g., a phone number), or an alphanumeric value (e.g., an identifier). A string could be any length, ranging from empty (zero characters) to one that spans many lines of text (including line break characters). The terms ‘string’ and ‘line’ are sometimes used interchangeably, even when they are not strictly the same thing.

The only change to the second paragraph is removing the mention of the abbreviation "regex", since I've moved that to the first paragraph.
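
If a concrete illustration would help learners, here is a minimal sketch in Python of the two uses the paragraphs mention: checking that a string looks like an e-mail address, and a "find and replace"-like operation. The pattern and the function names (`EMAIL_PATTERN`, `looks_like_email`, `redact_emails`) are hypothetical and not part of the lesson; the pattern is deliberately simplified and would not be adequate for real form validation.

```python
import re

# Hypothetical, deliberately simplified pattern (an assumption for illustration):
# some non-space, non-@ characters, an "@", then a domain containing a dot.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def looks_like_email(text: str) -> bool:
    """Return True only if the whole string matches the simplified pattern."""
    return EMAIL_PATTERN.fullmatch(text) is not None


def redact_emails(text: str) -> str:
    """Find-and-replace: swap every match in the text for a placeholder."""
    return EMAIL_PATTERN.sub("[email]", text)


if __name__ == "__main__":
    print(looks_like_email("reader@example.org"))  # a string that matches
    print(looks_like_email(""))                    # an empty string does not
    print(redact_emails("Contact reader@example.org or staff@example.org for help"))
```

Here `fullmatch` tests an entire string against the pattern, while `sub` searches inside a longer string and replaces each match, which mirrors the "define a search to match strings" wording in the second paragraph.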

Originally posted by @yoyology in https://github.com/LibraryCarpentry/lc-data-intro/issues/184#issuecomment-1015938853