Add support for Private Data Subelements (PDS) subfields

gsova-indev commented 8 months ago

Hello,

I would like to get PDS subfields in the output dictionary when parsing ISO-8583 IPM files.

For example, in the following record, DE48 includes PDS0105, which in turn contains four subfields according to the IPM specification:

{'MTI': '1644', 'DE24': '697', 'DE48': '010502500123120400000032062011010122001T', 'PDS0105': '0012312040000003206201101', 'PDS0122': 'T', 'DE71': 1}

File Type --> position 1-3, type: numeric, length: 3 --> value: 001
File Reference Date --> position 4-9, type: numeric, length: 6 --> value: 231204
Processor ID --> position 10-20, type: numeric, length: 11 --> value: 00000032062
File Sequence Number --> position 21-25, type: numeric, length: 5 --> value: 01101
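The positions above map directly onto string slices. A minimal sketch, assuming the PDS0105 value has already been extracted as a string; the names `PDS0105_LAYOUT` and `slice_subfields` are made up for illustration and are not part of cardutil:

```python
# Subfield layout of PDS0105 as listed above: (name, first position, last position).
# Positions are 1-based and inclusive.
PDS0105_LAYOUT = [
    ("File Type", 1, 3),
    ("File Reference Date", 4, 9),
    ("Processor ID", 10, 20),
    ("File Sequence Number", 21, 25),
]


def slice_subfields(value, layout):
    """Return {name: raw string} for each (name, first, last) entry."""
    # A 1-based inclusive range [first, last] is the Python slice [first - 1:last].
    return {name: value[first - 1:last] for name, first, last in layout}


pds0105 = "0012312040000003206201101"
subfields = slice_subfields(pds0105, PDS0105_LAYOUT)
print(subfields)
```

All values come back as raw strings here; type conversion is a separate step.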

Would it be possible for the parsing mechanism to support a configuration dictionary, like the following example for DE48, that describes how to process each PDS?

Configuration example:

config = {
    "bit_config": {
        "48": {
            "field_name": "Additional data",
            "field_type": "LLLVAR",
            "field_length": 0,
            "field_processor": "PDS",
            "field_pds_subfields": {
                "PDS0105": [
                    {"field_name": "File Type", "field_type": "FIXED", "field_length": 3},
                    {"field_name": "File Reference Date", "field_type": "FIXED", "field_length": 6,
                     "field_python_type": "datetime", "field_date_format": "%y%m%d"},
                    {"field_name": "Processor ID", "field_type": "FIXED", "field_length": 11,
                     "field_python_type": "string"},
                    {"field_name": "File Sequence Number", "field_type": "FIXED", "field_length": 5,
                     "field_python_type": "int"},
                ]
            }
        }
    }
}

With regard to the output, I propose the following structure:

{'MTI': '1644', 
 'DE24': '697', 
 'DE48': '010502500123120400000032062011010122001T', 
 'PDS0105': '0012312040000003206201101', 
 'PDS0105_SUBFIELDS': {
     'File Type': '001',
     'File Reference Date': datetime.datetime(2023, 12, 4, 0, 0, 0),
     'Processor ID': '00000032062',
     'File Sequence Number': 1101
    },
 'PDS0122': 'T', 
 'DE71': 1
}
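To make the proposal concrete, here is a rough sketch of how a list-style subfield config could produce that nested structure. It assumes the entries are consumed in order, each taking `field_length` characters; `convert` and `parse_subfields` are hypothetical helpers, not existing cardutil functions:

```python
import datetime

# List-style subfield config from the proposal: entries are read in order,
# each one consuming "field_length" characters of the PDS value.
PDS0105_SUBFIELDS = [
    {"field_name": "File Type", "field_length": 3},
    {"field_name": "File Reference Date", "field_length": 6,
     "field_python_type": "datetime", "field_date_format": "%y%m%d"},
    {"field_name": "Processor ID", "field_length": 11, "field_python_type": "string"},
    {"field_name": "File Sequence Number", "field_length": 5, "field_python_type": "int"},
]


def convert(raw, field_config):
    """Hypothetical converter: values stay strings unless a type is configured."""
    python_type = field_config.get("field_python_type", "string")
    if python_type == "int":
        return int(raw)
    if python_type == "datetime":
        return datetime.datetime.strptime(raw, field_config["field_date_format"])
    return raw


def parse_subfields(value, subfield_configs):
    """Walk the PDS value left to right, one config entry at a time."""
    result, offset = {}, 0
    for field_config in subfield_configs:
        end = offset + field_config["field_length"]
        result[field_config["field_name"]] = convert(value[offset:end], field_config)
        offset = end
    return result


record = {"PDS0105": "0012312040000003206201101", "PDS0122": "T"}
record["PDS0105_SUBFIELDS"] = parse_subfields(record["PDS0105"], PDS0105_SUBFIELDS)
print(record["PDS0105_SUBFIELDS"])
```

Because each offset is derived from the entries before it, this layout only works when every subfield is listed.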

What do you think of this proposal? I can also contribute several PDS subfield configurations that I currently use in my project.

adelosa commented 8 months ago

Parsing of sub-elements is possible. Please contact me via email to discuss.

adelosa commented 8 months ago

Here is how I see PDS subfield parsing working if you want to submit a pull request for this feature.

The config for subfields will need to be in its own config element rather than attached to DE48 because the PDS fields are not constrained to a single DE element. There are a number of nominated DE fields that are assigned as PDS and any of these fields may have the PDS elements. Most of the time they all end up in DE48, but if DE48 overflows, then it will use other fields nominated as PDS.
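As a sketch of why the PDS config cannot hang off DE48 alone, the following collects PDS elements from any number of nominated fields. The element layout (4-digit tag, 3-digit length, then the data) is inferred from the DE48 example in this thread, the function names are hypothetical, and the list of PDS-carrying fields is passed in rather than assumed:

```python
def split_pds(field_value):
    """Yield (tag, value) pairs from one PDS-carrying field.

    Assumed layout, inferred from the DE48 example in this thread:
    4-digit tag, 3-digit length, then that many characters of data.
    """
    position = 0
    while position < len(field_value):
        tag = field_value[position:position + 4]
        length = int(field_value[position + 4:position + 7])
        start = position + 7
        yield tag, field_value[start:start + length]
        position = start + length


def extract_pds(message, pds_fields):
    """Collect PDS elements from every nominated field present in the message."""
    found = {}
    for field in pds_fields:
        if field in message:
            for tag, value in split_pds(message[field]):
                found["PDS" + tag] = value
    return found


message = {"MTI": "1644", "DE48": "010502500123120400000032062011010122001T"}
# "DE62" is only a placeholder for "some other nominated field"; it is absent here.
pds = extract_pds(message, pds_fields=("DE48", "DE62"))
print(pds)
```

Since a given PDS tag could arrive in any of the nominated fields, keying the subfield config by PDS tag rather than by DE number keeps the lookup independent of where the element was found.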

In the config, I would like to use a key of "PDS" and it should look like the following:

config = {
    "1": {},
    ...
    "PDS": {
        "0105": {
            "subfields": {
                "1": {"field_start": 0, "field_name": "File Type", "field_length": 3, "field_python_type": "string"}
            }
        }
    }
}

I'm not aware of any variable-type fields within subfields, so there is no need for a field type. I would also like each subfield config to stand alone: I don't like the list format, because every field has to be defined for it to work correctly. I prefer explicit field definitions. Where possible, reuse the code that already processes the DE fields (for example, the function _string_to_pytype).

In terms of output from dumps, I would expect it to look like the following:

{'MTI': '1644', 
 'DE24': '697', 
 'DE48': '010502500123120400000032062011010122001T', 
 'PDS0105': '0012312040000003206201101', 
 'PDS0105_SF1': '001',
 'PDS0105_SF2': datetime.datetime(2023, 12, 4, 0, 0, 0),
 'PDS0105_SF3': '00000032062',
 'PDS0105_SF4': 1101,
 'PDS0122': 'T', 
 'DE71': 1
}

The dump dictionary should be flat, with no nesting. The current format is flat, which allows easy conversion to tabular formats, and that is what 99.9% of people use this for. "SF" = Subfield.
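A rough sketch of the flat layout, assuming a "PDS" config section in the shape suggested earlier, with a 0-based `field_start` on every subfield. The `field_date_format` key is borrowed from the first proposal in this thread, and `to_python` and `add_subfields` are hypothetical names; the real implementation would presumably reuse the existing DE conversion code instead:

```python
import datetime

# Hypothetical "PDS" config section; every subfield entry stands alone
# because it carries its own 0-based "field_start".
PDS_CONFIG = {
    "0105": {
        "subfields": {
            "1": {"field_start": 0, "field_name": "File Type",
                  "field_length": 3, "field_python_type": "string"},
            "2": {"field_start": 3, "field_name": "File Reference Date",
                  "field_length": 6, "field_python_type": "datetime",
                  "field_date_format": "%y%m%d"},
            "3": {"field_start": 9, "field_name": "Processor ID",
                  "field_length": 11, "field_python_type": "string"},
            "4": {"field_start": 20, "field_name": "File Sequence Number",
                  "field_length": 5, "field_python_type": "int"},
        }
    }
}


def to_python(raw, cfg):
    """Stand-in for the library's own string-to-type conversion."""
    if cfg.get("field_python_type") == "int":
        return int(raw)
    if cfg.get("field_python_type") == "datetime":
        return datetime.datetime.strptime(raw, cfg["field_date_format"])
    return raw


def add_subfields(record, pds_config):
    """Return a copy of record with flat PDSxxxx_SFn keys added."""
    flat = dict(record)
    for tag, tag_config in pds_config.items():
        raw = record.get("PDS" + tag)
        if raw is None:
            continue  # this PDS is not present in the record
        for number, cfg in tag_config["subfields"].items():
            start = cfg["field_start"]
            piece = raw[start:start + cfg["field_length"]]
            flat[f"PDS{tag}_SF{number}"] = to_python(piece, cfg)
    return flat


record = {"MTI": "1644", "PDS0105": "0012312040000003206201101", "PDS0122": "T"}
flat = add_subfields(record, PDS_CONFIG)
print(flat)
```

Removing any one entry from `"subfields"` leaves the others working unchanged, which is the point of carrying `field_start` on each entry.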

Other things:

Make sure you add tests and check the code coverage.
Make sure that you can go from the dict back to ISO format.
Make sure you update the docs in the docs folder.
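For the round-trip requirement in the list above, one possible approach is to overlay the `PDSxxxx_SFn` values onto the raw PDS value and then re-encode the element. This is only a sketch: the padding rules (zero-fill for ints, space-fill for strings), the 4-digit tag and 3-digit length layout, and all function names are assumptions, not cardutil behaviour:

```python
import datetime

# Hypothetical subfield config for PDS0105 with 0-based "field_start".
PDS0105_SUBFIELDS = {
    "1": {"field_start": 0, "field_length": 3, "field_python_type": "string"},
    "2": {"field_start": 3, "field_length": 6, "field_python_type": "datetime",
          "field_date_format": "%y%m%d"},
    "3": {"field_start": 9, "field_length": 11, "field_python_type": "string"},
    "4": {"field_start": 20, "field_length": 5, "field_python_type": "int"},
}


def to_raw(value, cfg):
    """Format a Python value back into its fixed-width text form."""
    if isinstance(value, datetime.datetime):
        return value.strftime(cfg["field_date_format"])
    if isinstance(value, int):
        return str(value).zfill(cfg["field_length"])  # restore leading zeros
    return str(value).ljust(cfg["field_length"])      # pad short strings with spaces


def rebuild_pds(record, tag, subfields):
    """Overlay PDSxxxx_SFn values onto the raw PDS value and return it."""
    width = max(c["field_start"] + c["field_length"] for c in subfields.values())
    chars = list(record.get("PDS" + tag, "").ljust(width))
    for number, cfg in subfields.items():
        key = f"PDS{tag}_SF{number}"
        if key in record:
            start = cfg["field_start"]
            chars[start:start + cfg["field_length"]] = to_raw(record[key], cfg)
    return "".join(chars)


def encode_pds(tag, value):
    """Assumed element layout: 4-digit tag, 3-digit length, then the data."""
    return f"{tag}{len(value):03d}{value}"


record = {
    "PDS0105_SF1": "001",
    "PDS0105_SF2": datetime.datetime(2023, 12, 4),
    "PDS0105_SF3": "00000032062",
    "PDS0105_SF4": 1101,
    "PDS0122": "T",
}
pds0105 = rebuild_pds(record, "0105", PDS0105_SUBFIELDS)
de48 = encode_pds("0105", pds0105) + encode_pds("0122", record["PDS0122"])
print(de48)
```

One design question this leaves open is precedence: the sketch lets the subfield values win over the raw `PDS0105` value when both are present in the dictionary.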

That's all I can think of at the moment, but I am sure we will have more things to resolve. Any questions, let me know!

gsova-indev commented 8 months ago

@adelosa Thank you for the review on our proposal and for the clarifications on how you'd like this feature to be implemented.

We'd like to contribute to cardutil when time allows. You'll hear back from us soon :)

adelosa commented 3 months ago

Hi @gsova-indev. Any update?

adelosa / cardutil

Add support for Private Data Subelements (PDS) subfields #16