Open haalasz opened 1 month ago
This is a good suggestion, but please specify what messages you would like to see instead, we will replace them in the code.
The user needs to know the following:
* the location of the error (row number, column name)
* the reason behind the error (data type does not match, mandatory cell is left empty, etc.)
* a possible solution (accepted date formats, accepted data types, fixed options in some columns, etc.)
All this in a visually pleasing way.
This could work well with issue #40.
It certainly helps that these general rules are spelled out clearly, but what I also was trying to ask you was to specify the second and third point for each relevant error message, i.e. tell us:
All of these one by one for each error message that you are not currently happy with. All error messages will display row name and column name, so you don't need to mention that these pieces of information should be added. If an error message already exists but doesn't behave as it should, as we saw with `run_date`, where we got a date format message although no date was specified in the first place, please note this as well. If a current error message is too general and should be split up into two or more specific cases, make a note of that as well.
I'm not sure what to do about problems that are not checked by the application currently. I think it's probably better to open a new issue to indicate that you would like that feature to be added, but in borderline cases just mention them here. It doesn't matter that much, what's important is that you specify what needs to be added somewhere.
@haalasz @CsongorFreytag This applies to both of you, as well as anybody else who is using the application to validate their metadata and notices that the error messages are not as informative as they should be.
First of all, we would like to see the error messages line by line. The current implementation returns them all in a single string, separated by literal `\n` characters:
["Number of cells with invalid date: 0\nSampleID is necessary in row 1, column 'sampleID':\nSampleID is necessary in row 2, column 'sampleID':\nSampleID is necessary in row 3, column 'sampleID':\nSampleID is necessary in row 4, column 'sampleID':\nSampleID is necessary in row 5, column 'sampleID':\nSampleID is necessary in row 6, column 'sampleID':"] []
Second, the `Number of cells with invalid date: 0` message is not necessary and should be removed.
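A minimal sketch of these first two points, assuming the validator hands back its messages as one newline-joined string inside a list, as in the output above. The function name `split_messages` is hypothetical and is not taken from the application code.

```python
def split_messages(raw_errors):
    """Split newline-joined validator output into one message per entry."""
    messages = []
    for block in raw_errors:
        for line in block.split("\n"):
            # Drop surrounding whitespace and the dangling trailing colon.
            line = line.strip().rstrip(":")
            # Skip blanks and the summary counter that should be removed.
            if not line or line.startswith("Number of cells with invalid date"):
                continue
            messages.append(line)
    return messages


raw = ["Number of cells with invalid date: 0\n"
       "SampleID is necessary in row 1, column 'sampleID':\n"
       "SampleID is necessary in row 2, column 'sampleID':"]
for message in split_messages(raw):
    print(message)
```

Each message can then be rendered on its own line (or as its own table row) by whatever front end displays the results.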
Third, the error messages themselves. I see five different causes that cannot be fixed automatically and require user intervention (see `.metagenomongo.csv`).
1 - A mandatory cell is left empty
SampleID is necessary in row 1, column 'sampleID':
row, column, error message, solution or action
Row 1, sampleID, Cell is empty, Please provide a valid sampleID
`project_directory`, `projectID`, `sampleID`, `collection_date` (issue #41)
2 - The date format is wrong
Invalid value in row 1, column 'collection_date': Expected data type: date
row, column, error message, solution or action
Row 1, collection_date, Invalid date format, Use YYYY-MM-DD, YYYY-MM, or YYYY
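As a rough sketch of case 2, the check below tries the three accepted formats in turn and returns the proposed message only when none of them matches. The function name `check_date` and the constant are hypothetical. It also assumes that empty cells are reported by the mandatory-cell check (case 1) rather than as a date format error, which would avoid the `run_date` behaviour described earlier in this thread.

```python
from datetime import datetime

# Accepted formats as listed in the proposed message.
ACCEPTED_DATE_FORMATS = ("%Y-%m-%d", "%Y-%m", "%Y")


def check_date(row, column, value):
    """Return None if the value is acceptable, otherwise a message string."""
    text = value.strip()
    if not text:
        # Assumption: emptiness is handled by the mandatory-cell check.
        return None
    for fmt in ACCEPTED_DATE_FORMATS:
        try:
            datetime.strptime(text, fmt)
            return None
        except ValueError:
            continue
    return (f"Row {row}, {column}, Invalid date format, "
            "Use YYYY-MM-DD, YYYY-MM, or YYYY")


print(check_date(1, "collection_date", "15/03/2024"))
```

Note that `strptime` is lenient about zero padding, so a value such as `2024-3-5` would pass; a stricter regular expression could be added if that is not wanted.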
3 - The cell contains a special character where it should not be
row, column, error message, solution or action
Row 1, sampleID, Special character detected, Remove any special characters
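Case 3 could be checked along the following lines. This is only a sketch: the thread does not define which characters count as special, so the allowed set here (letters, digits, underscore, hyphen and dot) is an assumption, and `check_special_characters` is a hypothetical name.

```python
import re

# Assumption: only letters, digits, '_', '-' and '.' are allowed.
DISALLOWED = re.compile(r"[^A-Za-z0-9_.-]")


def check_special_characters(row, column, value):
    """Return None if the value is clean, otherwise a message string."""
    if DISALLOWED.search(value) is None:
        return None
    return (f"Row {row}, {column}, Special character detected, "
            "Remove any special characters")


print(check_special_characters(1, "sampleID", "S#001"))
```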
4 - The cell contains a value that is not in the built-in list for the column (see `.metagenomongo.csv`)
Invalid value in row 1, column 'source_type': Possible values are: '['Human', 'Animal', 'Food', 'Environmental', 'Other', 'Missing', 'Not applicable', 'Not collected', 'Not provided', '']
row, column, error message, solution or action
Row 1, source_type, Invalid value, Valid options: Human, Animal, Food, Environmental, Other, Missing, Not applicable, Not collected, Not provided
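For case 4, most of the improvement is in how the list is printed. The sketch below joins the options with commas instead of showing the raw Python list. The values are copied from the current message for `source_type`; the names `ALLOWED_VALUES` and `check_allowed` are hypothetical, and in the application the lists would presumably come from `.metagenomongo.csv`.

```python
# Options copied from the current 'source_type' message.
ALLOWED_VALUES = {
    "source_type": ["Human", "Animal", "Food", "Environmental", "Other",
                    "Missing", "Not applicable", "Not collected",
                    "Not provided"],
}


def check_allowed(row, column, value):
    """Return None if the value is acceptable, otherwise a message string."""
    allowed = ALLOWED_VALUES.get(column)
    # Empty cells pass here because the current list also contains ''.
    if allowed is None or value == "" or value in allowed:
        return None
    return (f"Row {row}, {column}, Invalid value, "
            f"Valid options: {', '.join(allowed)}")


print(check_allowed(1, "source_type", "Plant"))
```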
5 - The cell contains a value with the wrong data type
Invalid data type in row 1, column 'SXT': Expected data type: int
row, column, error message, solution or action
Row 1, SXT, Invalid data type, Use a valid int (OR FLOAT IF THE EXPECTED DATA TYPE IS FLOAT)
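Case 5 might look like the sketch below, which picks the wording ("integer" or "float") from the expected type. The function name `check_number` is hypothetical, and as in the other sketches it assumes that empty cells are reported by the mandatory-cell check instead.

```python
def check_number(row, column, value, expected):
    """Validate a numeric cell; `expected` is the built-in int or float."""
    text = value.strip()
    if not text:
        # Assumption: emptiness is handled by the mandatory-cell check.
        return None
    label = "integer" if expected is int else "float"
    try:
        expected(text)
    except ValueError:
        return f"Row {row}, {column}, Invalid data type, Use a valid {label}"
    return None


print(check_number(2, "SXT", "abc", int))
print(check_number(3, "TEIMIC", "high", float))
```

Be aware that `float()` also accepts strings such as `nan` and `inf`, which may need to be rejected separately.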
In cases 1, 4 and 5 the messages are already there and just need some formatting. In case 2, just add the accepted date formats to the solution part of the error message. In case 3, I don't know whether there is any filtering to exclude special characters.
Expected error messages:
Row 1, projectID, Cell is empty, Please provide a valid projectID
Row 1, sampleID, Special character detected, Remove any special characters
Row 2, collection_date, Invalid date format, Use YYYY-MM-DD, YYYY-MM, or YYYY
Row 2, SXT, Invalid data type, Use a valid integer
Row 3, TEIMIC, Invalid data type, Use a valid float
Row 3, source_type, Invalid value, Valid options: Human, Animal, Food, Environmental, Other, Missing, Not applicable, Not collected, Not provided
Row 4, source_type, Invalid value, Valid options: Human, Animal, Food, Environmental, Other, Missing, Not applicable, Not collected, Not provided
Row 5, sampleID, Cell is empty, Please provide a valid sampleID
Row 6, collection_date, Invalid date format, Use YYYY-MM-DD, YYYY-MM, or YYYY
Row 8, sampleID, Special character detected, Remove any special characters
The current error messages in the app are often too vague, making it challenging for average users to understand and correct issues. For example, an error like `Invalid value in row 1, column 'run_date': Expected data type: date` lacks detail on acceptable formats.
Enhance error messages to be more user-friendly by providing clear, actionable information. For instance, instead of just indicating a data type issue, the message should specify the acceptable formats (e.g., `YYYY-MM-DD`, `YYYY-MM`, `YYYY` for date fields) and any other relevant details for each column.