name: in check_samplesheet.py, incorporate more flexible encoding when reading
about: prevent issues when reading .csv files with non-utf-8 encoding
Current Behavior
When preparing my sample_sheet.csv, I accidentally saved it with utf-8-sig encoding instead of the plain utf-8 that the script anticipates. utf-8-sig prepends a byte order mark (BOM, '\ufeff') to the file, so my first column header did not pass the check and my run was stopped. Some Windows text editors may use utf-8-sig encoding by default.
Steps to Reproduce
# create a file in utf-8-sig encoding
file_in = 'sample_sheet.csv'
with open(file_in, "w", encoding='utf-8-sig') as fh:
    fh.write('sample,assembly,fastq_1,fastq_2,coverage_tab,cov_from_assembly')
# then try to read it with utf-8 encoding (set explicitly, since the default depends on the platform)
with open(file_in, "r", encoding='utf-8') as fh:
    header = fh.readline().strip()
    header_cols = [header_col.strip('"') for header_col in header.split(",")]
# output
['\ufeffsample',
'assembly',
'fastq_1',
'fastq_2',
'coverage_tab',
'cov_from_assembly']
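To confirm that a BOM is what breaks the first header, it can help to look at the raw bytes of the file. The following is only a minimal sketch: the helper name starts_with_utf8_bom and the temporary file names are hypothetical and not part of check_samplesheet.py.

```python
import codecs
import os
import tempfile


def starts_with_utf8_bom(path):
    """Return True if the file begins with the UTF-8 byte order mark."""
    # Hypothetical helper for illustration; not part of check_samplesheet.py.
    with open(path, "rb") as fh:
        return fh.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8


# Write the same header once without and once with a BOM, then inspect the bytes.
tmp_dir = tempfile.mkdtemp()
results = {}
for encoding in ("utf-8", "utf-8-sig"):
    path = os.path.join(tmp_dir, f"sample_sheet_{encoding}.csv")
    with open(path, "w", encoding=encoding) as fh:
        fh.write("sample,assembly\n")
    results[encoding] = starts_with_utf8_bom(path)

print(results)
```

Only the file written with utf-8-sig should report a leading BOM.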
Expected Behavior
By changing the reader to use utf-8-sig encoding, we will be able to handle either case (utf-8 and utf-8-sig) without losing any functionality: the utf-8-sig codec skips a leading BOM when one is present and otherwise decodes the file as plain utf-8.
Example:
with open(file_in, "r", encoding='utf-8-sig') as fh:
    header = fh.readline().strip()
    header_cols = [header_col.strip('"') for header_col in header.split(",")]
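As a quick check that the proposed change covers both cases, the sketch below writes the same header with and without a BOM and parses both files with utf-8-sig. The function name read_header_cols, the EXPECTED list, and the temporary paths are assumptions made for this example; the actual reader in check_samplesheet.py may be structured differently.

```python
import os
import tempfile

# Expected header columns, taken from the reproduction steps in this issue.
EXPECTED = [
    "sample",
    "assembly",
    "fastq_1",
    "fastq_2",
    "coverage_tab",
    "cov_from_assembly",
]


def read_header_cols(path):
    """Parse the first line of a CSV into column names, tolerating a BOM."""
    # Hypothetical helper mirroring the header parsing shown in this issue.
    with open(path, "r", encoding="utf-8-sig") as fh:
        header = fh.readline().strip()
    return [col.strip('"') for col in header.split(",")]


# Write one file per encoding and parse each with the same reader.
tmp_dir = tempfile.mkdtemp()
parsed = {}
for encoding in ("utf-8", "utf-8-sig"):
    path = os.path.join(tmp_dir, f"sample_sheet_{encoding}.csv")
    with open(path, "w", encoding=encoding) as fh:
        fh.write(",".join(EXPECTED) + "\n")
    parsed[encoding] = read_header_cols(path)

print(parsed["utf-8"] == parsed["utf-8-sig"] == EXPECTED)
```

Both files should yield identical column lists, with no '\ufeff' prefix on the first column.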