When using Excel Source i would like to keep the types...

Snooz82 commented 3 years ago

tl;dr: is there a way to apply a datadriver Excel input file's cell format to the variable's format that is conveyed to the robot test?

Moin René,

first of all let me say that my users love the option of providing test data via Excel sheets. Great job with designing the datadriver library. There is however an issue that I have not yet found a solution for - maybe, you have an idea on how to solve this issue.

My main use case for using the data driver are API tests. Read: I have tons of potential input parameters which I expose to the users via an excel sheet.

The end user populates all required parameters (read: Excel cells).
My Robot script checks on a test-specific basis which of the test-specific cells are populated.
If the cell is populated, its value is then added to a Robot dictionary
That dictionary is later on converted to a JSON object which then is sent to my API.

This whole process works like a charm. However, there is an edge case which I need to cover but have not yet found a viable solution for:

Some of the APIs that I use require numeric content to be provided as string data: Read: my Excel cell will contain a value of e.g. 3 but I need to add the STRING value of "3" to the robot dictionary so that when it gets converted to the final JSON object.

Let's assume that I have a data driver Excel input file which contains two variable columns with just one row of data:

${VAR1} --> numeric value of 1
${VAR2} --> string value of 2 (e.g. cell's format was set to string,  Input was '2 in order to enforce string format)

When I check the data type 'the Python way', both variable's data types are returned as 'string':

Check Variable Type New
   [Arguments]   ${object}
   ${VARTYPE}=   Evaluate  type($object).__name__
   [Return]      ${VARTYPE}

This crude yet somewhat effective approach does work for most cases but the one in question:

Check Variable Type
    [Arguments]    ${object}

    Return From Keyword If        not "${object}"  NONE  ${EMPTY}

    ${result}  ${number_value}=   Run Keyword And Ignore Error   Convert To Number  ${object}
    ${isnumber}=  Run Keyword And Return Status  Should Be Equal As Strings  ${object}  ${number_value}

    ${result}  ${integer_value}=  Run Keyword And Ignore Error   Convert To Integer  ${object}
    ${isinteger}=  Run Keyword And Return Status  Should Be Equal As Strings  ${object}  ${integer_value}

    ${result}  ${bool_value}=     Run Keyword And Ignore Error    Convert To Boolean    ${object}
    ${isbool}=  Run Keyword And Return Status   Should Be Equal As Strings  ${object}  ${bool_value}

    Return From Keyword If        ${isnumber}   NUMBER        ${number_value}
    Return From Keyword If        ${isinteger}    INTEGER       ${integer_value}
    Return From Keyword If        ${isbool}        BOOLEAN       ${bool_value}
    Return From Keyword                                    STRING        ${object}

If this crude detection is applied, both variable's values are not detected as string but as integer - which is correct for the first case but not desired for the 2nd case where that value's format was set to 'string'

Q: apart from detecting these edge cases' formats with the help of e.g. leading format descriptors in the cell's value, is there any way for the data driver to get the cell's format based on its Excel formatting - and then return that cell's value in its respective format?

Note that this does not only affect numeric data types but also boolean types. E.g.

TRUE (as boolean) and 'TRUE (as string)

are both conveyed to the Robot Test via data driver with a 'string' data type

Danke Dir:slight_smile: Jörg

Snooz82 commented 3 years ago

Hallo Jörg,

Nice that you like it!

Regarding the DataTypes: At the moment the XLS(X) reader are configured so that they do convert all stuff to strings. But this could be made configurable.

So that this test keyword:

*** Keywords ***
Check Variables
    [Arguments]    ${var_1}    ${var_2}    ${var_name}   ${var_doc}   ${var_tags}
    Log To Console    \n
    Log To Console    \${var_1}: ${{type($var_1)}} - ${var_1}
    Log To Console    \${var_2}: ${{type($var_2)}} - ${var_2}
    Log To Console    \${var_name}: ${{type($var_name)}} - ${var_name}
    Log To Console    \${var_doc}: ${{type($var_doc)}} - ${var_doc}
    Log To Console    \${var_tags}: ${{type($var_tags)}} - ${var_tags}

With this excel file:

results in that output:

==============================================================================
Defaults Xlsx Data Type                                                       
==============================================================================
Test 123 :: €‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ                               

${var_1}: <class 'float'> - 1.0
${var_2}: <class 'str'> - Hello
${var_name}: <class 'datetime.time'> - 08:00:30
${var_doc}: <class 'int'> - 1
${var_tags}: <class 'str'> - ['tag1', 'tag2']
Test 123 :: €‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ                               | PASS |
------------------------------------------------------------------------------
default 3.41423 True :: ÖÄ?Üß!)=§$                                    

${var_1}: <class 'float'> - 3.41423
${var_2}: <class 'bool'> - True
${var_name}: <class 'str'> - WAHR
${var_doc}: <class 'float'> - 0.02
${var_tags}: <class 'str'> - ['tag1']
default 3.41423 True :: ÖÄ?Üß!)=§$                                    | PASS |
------------------------------------------------------------------------------
My Test Name                                                          

${var_1}: <class 'float'> - 13.0
${var_2}: <class 'bool'> - False
${var_name}: <class 'str'> - 123
${var_doc}: <class 'str'> - 
${var_tags}: <class 'str'> - []
My Test Name                                                          | PASS |
------------------------------------------------------------------------------
default 1.23 2012-10-05 00:00:00                                      

${var_1}: <class 'float'> - 1.23
${var_2}: <class 'datetime.datetime'> - 2012-10-05 00:00:00
${var_name}: <class 'str'> - 05.06.21
${var_doc}: <class 'str'> - 
${var_tags}: <class 'str'> - []
default 1.23 2012-10-05 00:00:00                                      | PASS |
------------------------------------------------------------------------------
Defaults Xlsx Data Type                                               | PASS |
4 tests, 4 passed, 0 failed
==============================================================================

If you would like to have the option preserve_xls_types=True for xlsx_reader could you please raise an issue? on Github?

Anyway: You can also already now use the RF type notation for int ${1}, float ${1.0}, and boolean ${true}. or Python evaluations ${{2+int("3")}}

Cheers

Snooz82 commented 3 years ago

Original Discussion here: https://forum.robotframework.org/t/get-a-cells-data-type-from-a-datadriver-excel-input-sheet-based-on-its-cell-format/2331

Snooz82 commented 3 years ago

MS Excel and typed cells Microsoft Excel xls or xlsx file have the possibility to type thair data cells. Numbers are typically of the type float. If these data are not explicitly defined as text in Excel, pandas will read it as the type that is has in excel. Because we have to work with strings in Robot Framework® these data are converted to string. This leads to the situation that a European time value like "04.02.2019" (4th January 2019) is handed over to Robot Framework® in Iso time "2019-01-04 00:00:00". This may cause unwanted behavior. To mitigate this risk you should define Excel based files explicitly as text within Excel.

Alternatively you may deactivate that string conversion. To do so, you have to add the option preserve_xls_types to True. In that case, you will get str, float, boolean, int, datetime.time, datetime.datetime and some others.

*** Settings ***
Library    DataDriver    file=my_data_source.xlsx    preserve_xls_types=True

joergschultzelutter commented 3 years ago

Works like a charm - thanks for applying this change!

joergschultzelutter commented 3 years ago

Minor addendum:

When preserve_xls_types=True is used in combination with empty Excel cells, the cell's variable type is always returned as str. If you need to detect these empty Excel cells in Robot and rather prefer a fake None data type over a str data type, the following workaround might be helpful:

Get Data Type
    [Arguments]    ${object}
    Return From Keyword If    not "${object}"    None

    ${VARTYPE}=     Evaluate     type($object).__name__
    [Return]        ${VARTYPE}

Snooz82 / robotframework-datadriver

When using Excel Source i would like to keep the types... #57