openwebwork / webwork2

Course management front end for WeBWorK
http://webwork.maa.org/wiki/Main_Page

Update Utils.pm - Fix Trim_spaces to remove UTF-8 BOM (Byte Order Mark) #2305

Closed tm-lcarvalho closed 5 months ago

tm-lcarvalho commented 6 months ago

Whenever we import a CSV file in UTF-8 with a BOM, we get a "Wide character in crypt" error message when passwords contain certain "wide" characters.

We don't have control over how the files to be imported are created. Normally, the professors who own the course start from a CSV file and use Excel to export it as UTF-8. However, this specific file came with a BOM. We use this list to import students into WeBWorK, and whenever there's a BOM, we encounter the 'Wide character in crypt' error. I have attached two files, one with a BOM and one without. Our workflow is to upload the list to WeBWorK and import it: WeBWorK -> TEMP Course Tests -> Instructor Tools -> Classlist Editor -> Import users from what file -> Import

╰─$ hexdump -n 3 -C StudentsList_course_csv_utf-8_w_BOM.csv
00000000  ef bb bf                                          |...|
00000003
╰─$ hexdump -n 3 -C StudentsList_course_csv_utf-8_withot_BOM.csv
00000000  32 33 37                                          |237|
00000003

Remove UTF-8 BOM (Byte Order Mark):

Line 832: $in =~ s/^\x{FEFF}//;

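This line removes the UTF-8 BOM from the beginning of the input string. In the file the BOM is the byte sequence EF BB BF; once the input has been decoded as UTF-8, it appears as the single Unicode character U+FEFF, which is what the pattern matches.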
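To illustrate when that substitution matches, here is a minimal sketch. It assumes the input may be either decoded text or raw bytes; the sample row and the variable names are placeholders for illustration, not taken from Utils.pm or from the attached files.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw bytes as they could appear at the start of a BOM-prefixed CSV file.
# The row content is a made-up placeholder.
my $bytes = "\xEF\xBB\xBF" . "237,placeholder";

# Decoding as UTF-8 turns the three BOM bytes into the single character U+FEFF.
my $text = decode('UTF-8', $bytes);

# The substitution from this pull request matches that decoded character.
(my $stripped = $text) =~ s/^\x{FEFF}//;

# On undecoded bytes the same pattern does not match ...
(my $raw_same_pattern = $bytes) =~ s/^\x{FEFF}//;

# ... so the byte sequence itself has to be matched instead.
(my $raw_stripped = $bytes) =~ s/^\xEF\xBB\xBF//;

print "decoded length: ", length($text), ", after strip: ", length($stripped), "\n";
```

Which of the two patterns is needed depends on whether the string has already been decoded by the time it is processed.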

StudentsList_course_csv_utf-8_w_BOM.csv StudentsList_course_csv_utf-8_withot_BOM.csv

drgrice1 commented 5 months ago

Note that I do not approve this pull request at this time, for the reason already given in the original pull request. The trim_spaces method is not where this should be done; that method is used for other things. Also, it is called the trim_spaces method, not the trim_spaces_and_boms_and_whatever_else_you_may_want_to_trim method. The correct place for this is where the class list file is first read.
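As a rough sketch of what stripping the BOM at read time could look like: the helper name `read_lines_without_bom` and the sample data are hypothetical, and this is not the code in WeBWorK or in any other pull request. It only shows the idea of handling the BOM once, where the file is opened.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper, not part of WeBWorK: read a file as UTF-8 text and
# drop a leading BOM once, at the point where the file is read.
sub read_lines_without_bom {
    my ($path) = @_;
    open(my $fh, '<:encoding(UTF-8)', $path) or die "Cannot open $path: $!";
    my @lines = <$fh>;
    close($fh);

    # Only the very first line of a file can carry the BOM.
    $lines[0] =~ s/^\x{FEFF}// if @lines;
    chomp(@lines);
    return @lines;
}

# Demonstration with a temporary file that starts with the bytes EF BB BF.
# The row content is a made-up placeholder.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
binmode($tmp_fh);
print {$tmp_fh} "\xEF\xBB\xBF" . "237,placeholder\n";
close($tmp_fh);

my @lines = read_lines_without_bom($tmp_path);
print "first line: $lines[0]\n";
```

With this approach, general-purpose helpers such as trim_spaces stay unchanged, and every later consumer of the lines sees BOM-free text.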

pstaabp commented 5 months ago

For documentation's sake, this was originally at #2304, and there are relevant comments there.

drgrice1 commented 5 months ago

Closing this in favor of #2323 which fixes this issue in the correct way.

drgrice1 commented 5 months ago

This was accidentally merged instead of being closed, but was reverted from the command line.