ome / omero-scripts

Core OMERO Scripts
https://pypi.org/project/omero-scripts/

Populate metadata and BOM #179

Closed pwalczysko closed 3 years ago

pwalczysko commented 3 years ago

The Populate Metadata script fails if the CSV file fed into it starts with a BOM (byte order mark). A BOM is easily created with the following workflow:

  1. Open Microsoft Excel. Import (i.e. do not Open) a CSV file. Edit some bits of this file and save it again as CSV (that is how a user would inadvertently obtain a BOM-marked file).

  2. You can set or unset the BOM using VIM (command :set nobomb to unset, :set bomb to set).

Suggestion: Make the Populate Metadata script robust to CSV files with a BOM, similarly to https://github.com/ome/omero-metadata/pull/40/commits/44f69e9563f2d2c595030a94e7c51b9e21c9f1a6

cc @sbesson @manics

omero_metadata.populate.MetadataError: 
Column type 'well' unknown.
Choose from following: plate,well,image,dataset,roi,d,l,s,b
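
A minimal sketch of what is probably going on, and of the suggested fix. This is an assumption, not taken from the script itself: if the file is decoded as plain UTF-8, the BOM character stays glued to the first field, so 'well' no longer matches the known column types even though it looks identical when printed. The `utf-8-sig` codec strips a leading BOM and reads BOM-free files unchanged. The sample bytes and the `read_header` helper are made up for illustration.

```python
import csv
import io

# Hypothetical CSV content; the leading bytes EF BB BF are the UTF-8 BOM.
raw = b"\xef\xbb\xbfwell,image\nA1,img1\n"


def read_header(data, encoding):
    """Decode CSV bytes with the given encoding and return the first row."""
    text = io.StringIO(data.decode(encoding), newline="")
    return next(csv.reader(text))


plain = read_header(raw, "utf-8")       # BOM stays attached to the first field
robust = read_header(raw, "utf-8-sig")  # BOM is stripped while decoding

print(repr(plain[0]))   # '\ufeffwell'
print(repr(robust[0]))  # 'well'
```

Because `utf-8-sig` also accepts input without a BOM, using it for reading should be safe for both kinds of file.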
joshmoore commented 3 years ago

https://github.com/ome/omero-py/blob/3f68db664839a8f383baff6759d5ef0569863690/src/omero/util/populate_roi.py#L173 would be the place to change if you want to give it a try, @pwalczysko

will-moore commented 3 years ago

Trying to reproduce this, but I can't seem to get Excel to add any bom to a CSV that I can see via cat file.csv or in my editor.

How do you call those commands in VIM? Calling :set bomb doesn't seem to do anything.

Or is there an example file I could use?

sbesson commented 3 years ago

@will-moore using the same principle as the commit mentioned above, the following Python snippet should convert a CSV file into a CSV with a BOM:

# Copy test.csv to test2.csv; the 'utf-8-sig' codec writes a UTF-8 BOM first
with open('test.csv', 'r') as f:
  with open('test2.csv', 'w', encoding='utf-8-sig') as f2:
    data = f.read()
    f2.write(data)
pwalczysko commented 3 years ago

> Trying to reproduce this, but I can't seem to get Excel to add any bom to a CSV that I can see via cat file.csv or in my editor.
>
> How do you call those commands in VIM? Calling :set bomb doesn't seem to do anything.
>
> Or is there an example file I could use?

The VIM command almost certainly worked. Yes, the output is underwhelming, but after typing :set bomb and hitting Enter, you can verify the result in VIM by asking :set bomb? and the answer will be bomb.

will-moore commented 3 years ago

So how would I know the bom has been added? I don't see any change using cat test.csv or opening in text editor (VS Code). So is the only way to know it's there to try and use it in the populate metadata workflow? The VIM approach does nothing when I enter :set bomb. It doesn't ask me to confirm.


pwalczysko commented 3 years ago

> The VIM approach does nothing when I enter :set bomb. It doesn't ask me to confirm.

As indicated, it will be very silent on output. You still just confirm with Enter. Then ask it :set bomb? (note the question mark) and it will answer bomb

(if your action was unsuccessful, then the answer would be nobomb).

manics commented 3 years ago

A lot of text editors hide the BOM. You can check it by looking at the binary data, e.g. run od -a -N10 FILENAME to see the first 10 bytes of FILENAME.
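
The same check can be sketched in Python by reading the first three bytes in binary mode and comparing them with the UTF-8 BOM (EF BB BF). The helper name `has_utf8_bom` and the temporary file names are made up for illustration.

```python
import codecs
import os
import tempfile


def has_utf8_bom(path):
    """Return True if the file at `path` starts with the UTF-8 BOM bytes."""
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


# Demo on two throwaway files: one written with a BOM, one without.
tmpdir = tempfile.mkdtemp()
with_bom = os.path.join(tmpdir, "with_bom.csv")
without_bom = os.path.join(tmpdir, "without_bom.csv")

with open(with_bom, "w", encoding="utf-8-sig") as f:
    f.write("well,image\n")
with open(without_bom, "w", encoding="utf-8") as f:
    f.write("well,image\n")

print(has_utf8_bom(with_bom))     # True
print(has_utf8_bom(without_bom))  # False
```

This only looks for the UTF-8 BOM; UTF-16 and UTF-32 files use different marker bytes.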