DEpt-metagenom / MetagenoMongo

Improve error message clarity #37

Open haalasz opened 1 month ago

haalasz commented 1 month ago

The current error messages in the app are often too vague, which makes it hard for average users to understand and correct issues. For example, an error like `Invalid value in row 1, column 'run_date': Expected data type: date` does not say which date formats are acceptable.

Enhance error messages to be more user-friendly by providing clear, actionable information. For instance, instead of just indicating a data type issue, the message should specify the acceptable formats (e.g., YYYY-MM-DD, YYYY-MM, YYYY for date fields) and any other relevant details for each column.

gpetho commented 1 month ago

This is a good suggestion, but please specify what messages you would like to see instead, we will replace them in the code.

haalasz commented 1 month ago

> This is a good suggestion, but please specify what messages you would like to see instead, we will replace them in the code.

The user needs to know the following:

* the location of the error (row number, column name)

* the reason behind the error (data type is not matching, mandatory cell is left empty etc.)

* possible solution (accepted date formats, accepted data types, fixed options in some columns etc.)

All this in a visually pleasing way.

This could work well with issue #40.
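As a rough illustration of the three points above, an error could be carried around as a small record with a location, a reason and a suggested solution. This is only a sketch: `ValidationIssue` and its field names are hypothetical and are not taken from the MetagenoMongo code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationIssue:
    """Hypothetical record holding everything the user needs to fix one cell."""

    row: int       # 1-based row number in the uploaded table
    column: str    # column name, e.g. "run_date"
    reason: str    # why the cell was rejected
    solution: str  # what the user can do about it

    def render(self) -> str:
        # Location first, then the reason, then the suggested fix.
        return f"Row {self.row}, column '{self.column}': {self.reason}. {self.solution}."


issue = ValidationIssue(
    row=1,
    column="run_date",
    reason="Invalid date format",
    solution="Use YYYY-MM-DD, YYYY-MM, or YYYY",
)
print(issue.render())
```

Keeping the parts separate until the last moment would also leave the visual presentation (table, colours, one line per issue) up to the front end.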

gpetho commented 1 month ago

> > This is a good suggestion, but please specify what messages you would like to see instead, we will replace them in the code.
>
> The user needs to know the following:
>
> * the location of the error (row number, column name)
> * the reason behind the error (data type is not matching, mandatory cell is left empty etc.)
> * possible solution (accepted date formats, accepted data types, fixed options in some columns etc.)
>
> All this in a visually pleasing way.

It certainly helps that these general rules are spelled out clearly, but I was also asking you to specify the second and third points for each relevant error message, i.e. tell us:

  1. under exactly what conditions the message should appear,
  2. exactly what error message should be displayed to the user, i.e. "the reason behind the error" phrased the way you want to see it in the application,
  3. what suggested solution you want to see displayed,
  4. and what the currently displayed message is that you don't like, so it's easier for us to locate in the code.

Please list these one by one for each error message that you are not currently happy with. All error messages will display row name and column name, so you don't need to mention that these pieces of information should be added. If an existing error message doesn't behave as it should, as we saw with run_date, where we got a date format message although no date had been specified in the first place, please note this as well. If a current error message is too general and should be split into two or more specific cases, make a note of that too.

I'm not sure what to do about problems that are not checked by the application currently. I think it's probably better to open a new issue to indicate that you would like that feature to be added, but in borderline cases just mention them here. It doesn't matter that much, what's important is that you specify what needs to be added somewhere.

@haalasz @CsongorFreytag This applies to both of you, as well as anybody else who is using the application to validate their metadata and notices that the error messages are not as informative as they should be.

haalasz commented 1 month ago

First of all, we would like the error messages to be displayed one per line. Currently they all arrive as a single string joined with `\n`:

["Number of cells with invalid date: 0\nSampleID is necessary in row 1, column 'sampleID':\nSampleID is necessary in row 2, column 'sampleID':\nSampleID is necessary in row 3, column 'sampleID':\nSampleID is necessary in row 4, column 'sampleID':\nSampleID is necessary in row 5, column 'sampleID':\nSampleID is necessary in row 6, column 'sampleID':"] []

Second, the `Number of cells with invalid date: 0` message is unnecessary and should be removed.
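A minimal sketch of the first two points, assuming the messages arrive as a list of strings joined with `\n` as in the output above. The helper name `split_messages` is made up for illustration and does not exist in the application.

```python
def split_messages(raw_messages, drop_prefixes=("Number of cells with invalid date:",)):
    """Split newline-joined messages into one entry per line.

    Lines that start with any of `drop_prefixes` are discarded, and the
    dangling ':' at the end of each message is removed.
    """
    lines = []
    for raw in raw_messages:
        for line in raw.split("\n"):
            line = line.strip().rstrip(":")
            if not line or line.startswith(drop_prefixes):
                continue
            lines.append(line)
    return lines


raw = [
    "Number of cells with invalid date: 0\n"
    "SampleID is necessary in row 1, column 'sampleID':\n"
    "SampleID is necessary in row 2, column 'sampleID':"
]
for message in split_messages(raw):
    print(message)
```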

Third, regarding the error messages themselves, I see five different causes that cannot be fixed automatically and require user intervention:

  1. Mandatory cell left empty
  2. The date format is wrong
  3. The cell contains a special character where it should not be
  4. The cell contains a value other than the built-in list for the column (see .metagenomongo.csv)
  5. The cell contains a value with the wrong data type

In cases 1, 4 and 5 the messages already exist and only need some formatting. In case 2, just add the accepted date formats to the solution part of the error message. In case 3, I don't know whether the application currently filters out special characters at all.
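For case 2, a check along these lines could decide whether a value matches one of the accepted formats (YYYY-MM-DD, YYYY-MM, YYYY). This is a sketch only: `is_accepted_date` is a hypothetical name, and I have not checked how the application validates dates today.

```python
import re
from datetime import datetime

# Each accepted shape is paired with the strptime format used to confirm
# that the digits also form a real calendar date.
_ACCEPTED = (
    (re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}"), "%Y-%m-%d"),
    (re.compile(r"[0-9]{4}-[0-9]{2}"), "%Y-%m"),
    (re.compile(r"[0-9]{4}"), "%Y"),
)


def is_accepted_date(value: str) -> bool:
    """Return True if value is written as YYYY-MM-DD, YYYY-MM or YYYY."""
    text = value.strip()
    for shape, fmt in _ACCEPTED:
        if shape.fullmatch(text):
            try:
                datetime.strptime(text, fmt)
            except ValueError:
                return False  # right shape, impossible date (e.g. month 13)
            return True
    return False
```

An empty cell returns `False` here, so the caller would need to test for emptiness first and report "Cell is empty" instead of a date format error.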

Expected error messages:

Row 1, projectID, Cell is empty, Please provide a valid projectID
Row 1, sampleID, Special character detected, Remove any special characters
Row 2, collection_date, Invalid date format, Use YYYY-MM-DD, YYYY-MM, or YYYY
Row 2, SXT, Invalid data type, Use a valid integer
Row 3, TEIMIC, Invalid data type, Use a valid float
Row 3, source_type, Invalid value, Valid options: Human, Animal, Food, Environmental, Other, Missing, Not applicable, Not collected, Not provided
Row 4, source_type, Invalid value, Valid options: Human, Animal, Food, Environmental, Other, Missing, Not applicable, Not collected, Not provided
Row 5, sampleID, Cell is empty, Please provide a valid sampleID
Row 6, collection_date, Invalid date format, Use YYYY-MM-DD, YYYY-MM, or YYYY
Row 8, sampleID, Special character detected, Remove any special characters
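
To show how lines in this format could be produced for cases 1, 4 and 5, here is a small sketch. The function `check_cell` and its parameters are assumptions for illustration. In the application, the per-column rules would presumably come from .metagenomongo.csv rather than from arguments.

```python
SOURCE_TYPE_OPTIONS = (
    "Human", "Animal", "Food", "Environmental", "Other",
    "Missing", "Not applicable", "Not collected", "Not provided",
)

# Wording used in the messages for each Python type.
_KIND_NAMES = {int: "integer", float: "float"}


def check_cell(row, column, value, *, mandatory=False, kind=None, options=None):
    """Return one message line for the first problem found, or None."""
    text = value.strip()
    if not text:
        # Case 1: only mandatory cells are reported when empty.
        if mandatory:
            return f"Row {row}, {column}, Cell is empty, Please provide a valid {column}"
        return None
    if options is not None and text not in options:
        # Case 4: value is not in the fixed list for this column.
        return f"Row {row}, {column}, Invalid value, Valid options: {', '.join(options)}"
    if kind is not None:
        # Case 5: value cannot be parsed as the expected type.
        try:
            kind(text)
        except ValueError:
            return f"Row {row}, {column}, Invalid data type, Use a valid {_KIND_NAMES[kind]}"
    return None


print(check_cell(1, "projectID", "", mandatory=True))
print(check_cell(2, "SXT", "abc", kind=int))
print(check_cell(3, "source_type", "Plant", options=SOURCE_TYPE_OPTIONS))
```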