P1sec / pycrate

A Python library to ease the development of encoders and decoders for various protocols and file formats; contains ASN.1 and CSN.1 compilers.
GNU Lesser General Public License v2.1
382 stars 132 forks

Error while parsing custom CDR format #237

Closed: fuzemobi closed this issue 1 year ago

fuzemobi commented 1 year ago

I have been working on a CDR to TAP conversion project and, through trial and error, have partially decoded a custom vendor CDR format to JSON. The issue is that the vendor format contains some unparsable blocks that I have been unable to skip. I am able to skip the header when parsing the file, which gives me clean JSON for the first record.

I have 2 questions:

  1. How can I skip the 5 byte block between the records that I want to parse?
  2. Is there a native way to convert a CDR to a TAP3 version ASN file?

Setup

OS: macOS M1 Ventura and Ubuntu 20, 22
ASN: vendor.asn.zip
CDR: sample.cdr.zip
CODE:

Problem

I used the ASN compiler to test the GPRSRecord.py import as well, with no luck.

The decoder example uses the default CDR format in (GPRSChargingDataTypes.asn).
I used the parser example from [asn1.io](https://asn1.io/CDR-inspector/default.aspx) to inspect the CDR visually.

[Screenshot 2023-06-02 at 8:00:55 AM]

As you can see, there is a 59 byte header at the start of the file and a 5 byte block in between each record. To get valid JSON, I had to skip the first header. The header is not defined in the vendor.asn file, and neither is the block of 5 bytes between the records.

# skip first 59 bytes to bypass invalid header
fd.seek(59, 0)

Here is an example of the output of the to_json parser:

python  decoder_example.py -i ~/Downloads/sample.cdr

starting processor...
{
 "sGWRecord": {
  "accessPointNameNI": "vzwadmin",
  "apnSelectionMode": "mSorNetworkProvidedSubscriptionVerified",
  "causeForRecClosing": 17,
  "chChSelectionMode": "roamingDefault",
  "chargingCharacteristics": "0008",
  "chargingID": 371602685,
  "duration": 20,
  "dynamicAddressFlag": true,
  "listOfTrafficVolumes": [
   {
    "changeCondition": "recordClosure",
    "changeTime": "2305230825552d0800",
    "dataVolumeGPRSDownlink": 0,
    "dataVolumeGPRSUplink": 0,
    "ePCQoSInformation": {
     "aPNAggregateMaxBitrateDL": 400000,
     "aPNAggregateMaxBitrateUL": 400000,
     "aRP": 13,
     "qCI": 8
    }
   }
  ],
  "mSTimeZone": "0000",
  "pDNConnectionChargingID": 371602685,
  "pdpPDNType": "0121",
  "rATType": 6,
  "recordOpeningTime": "2305230825352d0800",
  "recordSequenceNumber": 21,
  "recordType": 84,
  "s-GWAddress": {
   "iPBinaryAddress": {
    "iPBinV4Address": "0a3a011d"
   }
  },
  "servedIMEI": "1111111111111111111",
  "servedIMSI": "1111111111111111111",
  "servedMSISDN": "1111111111111111111",
  "servedPDPPDNAddress": {
   "iPAddress": {
    "iPBinaryAddress": {
     "iPBinV4Address": "0a874e7b"
    }
   }
  },
  "servingNodeAddress": [
   {
    "iPBinaryAddress": {
     "iPBinV4Address": "0a3a0126"
    }
   }
  ],
  "servingNodeType": [
   "mME"
  ],
  "userLocationInfoTime": "2305230812352d0800",
  "userLocationInformation": "82130165006413016500004e1f"
 }
}
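Several fields in this JSON are hex strings that need further decoding before they can be mapped anywhere. The sketch below shows one way to read two of them. It assumes the 9-octet TimeStamp layout from 3GPP TS 32.298 (BCD YYMMDDhhmmss, an ASCII sign, then a BCD hhmm offset) and a 20YY century; the helper names `decode_ipv4` and `decode_timestamp` are illustrative and not part of pycrate.

```python
import ipaddress


def decode_ipv4(hex_str):
    """Turn an 8-hex-digit iPBinV4Address value into dotted-quad notation."""
    return str(ipaddress.IPv4Address(bytes.fromhex(hex_str)))


def decode_timestamp(hex_str):
    """Decode a 9-octet TimeStamp into an ISO 8601 string.

    Assumed layout: BCD YY MM DD hh mm ss, ASCII '+' or '-', BCD hh mm offset.
    The century is assumed to be 20YY.
    """
    raw = bytes.fromhex(hex_str)
    if len(raw) != 9:
        raise ValueError("expected 9 octets, got %d" % len(raw))
    # A BCD octet shown in hex already reads as its two decimal digits.
    yy, mo, dd, hh, mi, ss = (raw[i:i + 1].hex() for i in range(6))
    sign = chr(raw[6])
    if sign not in "+-":
        raise ValueError("unexpected sign octet 0x%02x" % raw[6])
    off_h, off_m = raw[7:8].hex(), raw[8:9].hex()
    return "20%s-%s-%sT%s:%s:%s%s%s:%s" % (
        yy, mo, dd, hh, mi, ss, sign, off_h, off_m)


print(decode_ipv4("0a3a011d"))                 # s-GWAddress
print(decode_timestamp("2305230825352d0800"))  # recordOpeningTime
```

With the values from the record above, the two calls print `10.58.1.29` and `2023-05-23T08:25:35-08:00`.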

The parser stops parsing when it gets to the 5 byte block in between each record. Is there a way to define this block in the vendor.asn spec file? When I compiled their file, there was no reference to any of the headers. I also tried a byte replace on the values in the TLV, but my code breaks the lengths, so decoding fails.
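Replacing bytes inside a TLV breaks decoding because every BER element carries its own length, and each enclosing element's length covers its children. An alternative is to leave the bytes alone and use the outer length of each record to find where it ends, then step over the gap. The following is a minimal pure-Python sketch of that idea. The defaults of 59 and 5 bytes come from the sample file in this thread, the function names are hypothetical, and indefinite-length encodings are not handled.

```python
def read_tlv_end(buf, pos):
    """Return the offset just past the BER TLV that starts at buf[pos]."""
    first = buf[pos]
    pos += 1
    if first & 0x1F == 0x1F:
        # High-tag-number form: more tag octets follow while bit 8 is set.
        while buf[pos] & 0x80:
            pos += 1
        pos += 1
    length = buf[pos]
    pos += 1
    if length == 0x80:
        raise ValueError("indefinite length is not handled in this sketch")
    if length & 0x80:
        # Long form: the low 7 bits give the number of length octets.
        n = length & 0x7F
        length = int.from_bytes(buf[pos:pos + n], "big")
        pos += n
    end = pos + length
    if end > len(buf):
        raise ValueError("TLV runs past the end of the buffer")
    return end


def iter_records(buf, header_len=59, gap_len=5):
    """Yield (offset, record_bytes) for each top-level TLV, skipping the gaps."""
    pos = header_len
    while pos < len(buf):
        end = read_tlv_end(buf, pos)
        yield pos, buf[pos:end]
        pos = end + gap_len


# Synthetic buffer: 59-byte header, short record, 5-byte gap, long-form record.
rec1 = bytes.fromhex("bf4e03800154")
rec2 = bytes.fromhex("bf4e8180") + bytes(128)
demo = bytes(59) + rec1 + b"\xff" * 5 + rec2
for offset, record in iter_records(demo):
    print(offset, len(record))
```

Each yielded slice is one complete record that could then be handed to a BER decoder on its own. Whether the real file ends with a trailing 5 byte block is not known from the thread; the loop above simply stops once the position passes the end of the buffer.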

Also, I am not familiar with how to do a conversion from a CDR output to TAP3. Is there any example that can perform this that isn't a manual mapping from CDR to TAP?

Thanks in advance. Please let me know if there is any other input you need.

fuzemobi commented 1 year ago

I did a test on the TAP3 format using:

from pycrate_asn1dir.TAP3 import *
fd = open('sample.cdr', 'rb')
fd.seek(59, 0) # skip the first 59 bytes (header)
buf = fd.read()
fd.close()
tap3 = TAP_0312.DataInterChange
tap3.from_ber(buf) # from a fileopen buffer
# tap3.from_json(cdrJson) # from a valid GPRSRecord json
print(tap3.show())

I guess I didn't really expect that the parser would be able to convert without some mapping.

starting processor...
CHOICE._decode_ber_cont: DataInterChange, unknown extension tag (2, 78)
<~ASN1~: _ext_2178 : 'b'800154830813410876444476f8a40680040a3a011d8504162634fda60680040a3a01268708767a7761646d696e88020121a908a00680040a874e7b8b01ffac28302683010084010085010286092305230825552d0800a91081010886010d8703061a808803061a808d092305230825352d08008e01148f011191011595010096062102226803f1970200089801049d0853195844002369f09e01069f1f0200009f200d82130165006413016500004e1fbf23030a01059f2804162634fd9f34092305230812352d0800''H>

p1-bmu commented 1 year ago

Thanks for your feedback. The sample.cdr file does appear to contain BER-encoded GPRSRecord structures, consistent with their ASN.1 definition. However, these do not correspond to TAP3 structures (which are also BER-encoded).
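The pair `(2, 78)` in the TAP3 error above appears to be a BER tag given as (class, number), where class 2 is context-specific. The sketch below decodes BER identifier octets by hand to show how such a pair arises. It assumes each record starts with the octets `bf 4e`, which is how BER encodes a constructed context-specific tag 78; those octets are not visible in the hex dump, which starts at the record content. `parse_identifier` is an illustrative helper, not a pycrate function.

```python
def parse_identifier(buf, pos=0):
    """Decode the BER identifier octets at buf[pos].

    Returns (tag_class, constructed, tag_number, octets_used).
    """
    first = buf[pos]
    tag_class = first >> 6            # 0 universal, 1 application,
                                      # 2 context-specific, 3 private
    constructed = bool(first & 0x20)  # set when the value holds nested TLVs
    number = first & 0x1F
    used = 1
    if number == 0x1F:
        # High-tag-number form: 7 bits per octet, bit 8 marks continuation.
        number = 0
        while True:
            octet = buf[pos + used]
            used += 1
            number = (number << 7) | (octet & 0x7F)
            if not octet & 0x80:
                break
    return tag_class, constructed, number, used


print(parse_identifier(bytes.fromhex("bf4e")))  # assumed record start
print(parse_identifier(bytes.fromhex("9f1f")))  # octets taken from the dump
print(parse_identifier(bytes.fromhex("80")))    # first octet of the dump
```

The three calls print `(2, True, 78, 2)`, `(2, False, 31, 2)` and `(2, False, 0, 1)`. A context-specific tag only has meaning inside the type that defines it, which is why a decoder built for a TAP3 type reports tag 78 as unknown.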

In your decoding routine, you can loop on the decoding of GPRSRecord like this:

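    # Charpy is pycrate's bit-level cursor over a bytes buffer
    from pycrate_core.charpy import Charpy

    print("\nstarting processor...")

    gprs = GPRSChargingDataTypes.GPRSRecord
    # buf is assumed to hold the file content read after the 59-byte header
    char = Charpy(buf)
    it   = 0

    while char.len_byte() >= 2:
        # remember where this record starts (char._cur is a bit offset)
        cur = char._cur
        try:
            gprs.from_ber(char)
        except Exception as err:
            print('GPRSRecord BER decoding error: %r' % err)
            # cur >> 3 converts the bit offset into a byte offset
            print('Exiting at file offset %i' % (cur>>3, ))
            # exit the while loop
            break
        else:
            #print(gprs.to_json())
            print(80*'-' + '\n' + 'GPRSRecord %i:\n%s\n' % (it, gprs.to_asn1()))
            #
            it += 1
            # jump over the 5 interleaved bytes (5 * 8 bits)
            char._cur += 5 * 8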

This leads to the following:

GPRSRecord BER decoding error: ASN1ObjErr("PDPAddress.iPAddress.iPBinaryAddress.iPBinV6Address.iPBinV6Address: value out of size constraint, b'&\\x00\\x10\\x0c\\x82\\n\\xb3u'")
Exiting at file offset 9799

So the decoder exits due to a BER decoding error. It seems an IPv6 address is badly encoded there. To overcome this issue, you need to disable bound checking for ASN.1 objects. You can do it with the following import and configuration at the beginning of your script:

from pycrate_asn1dir.CDR import *
from pycrate_asn1rt.asnobj import ASN1Obj
ASN1Obj._SAFE_BND = False

With this, you will decode the entire file into 116 GPRSRecords.
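The size-constraint error reported earlier can be checked by hand. The rejected value is 8 octets long, whereas an IPv6 address needs 16. The sketch below assumes that iPBinV6Address is constrained to 16 octets (as in 3GPP TS 32.298) and, purely as one possible reading, formats a shorter value as a zero-padded prefix. `check_ipv6_octets` is an illustrative name.

```python
import ipaddress

IPV6_LEN = 16  # assumed SIZE constraint on iPBinV6Address


def check_ipv6_octets(raw):
    """Return (fits_constraint, readable_text) for a candidate IPv6 value."""
    if len(raw) == IPV6_LEN:
        return True, str(ipaddress.IPv6Address(raw))
    if len(raw) < IPV6_LEN:
        # Assumption: a short value is a prefix; pad it with zero octets.
        padded = raw + bytes(IPV6_LEN - len(raw))
        prefix = ipaddress.IPv6Network((padded, len(raw) * 8))
        return False, str(prefix)
    return False, raw.hex()


# Value copied from the decoding error message in this thread.
rejected = b"&\x00\x10\x0c\x82\n\xb3u"
print(len(rejected), check_ipv6_octets(rejected))
```

For the value from the error message this prints `8 (False, '2600:100c:820a:b375::/64')`, which suggests the vendor stores only the first 64 bits of the address. Disabling bound checking lets the decoder accept such values, so any later mapping code should expect IPv6 fields that are shorter than 16 octets.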

fuzemobi commented 1 year ago

@p1-bmu - Thank you for this. I was able to extract a discrete sGWRecord object for all records.

If you have any ideas on how to do a safe mapping here for CDR to TAP I would really appreciate it. If not, when I finish my data mapping I'll share my approach and see if there is a way to contribute back.

Best, Chad

p1-bmu commented 1 year ago

I have no suggestion for the mapping to TAP3, and in general I lack experience with CDRs. But yes, if you build something with pycrate that could be useful to other users, please let me know.

Thanks, Benoit

fuzemobi commented 1 year ago

Another question:

I am getting a lot of "init_modules: different OID objects" errors when creating gprs = GPRSChargingDataTypes.GPRSRecord and calling gprs.from_ber(char).

When I try to set these:

# declare global variables
ASN1Obj._SAFE_BND = False
ASN1Obj._SILENT = True

I am still seeing the errors. I manually commented this out in the __init__.py but can't seem to find a safe way to do this otherwise.

Any suggestions?

p1-bmu commented 1 year ago

This commit will fix silencing the OID warning at module initialization: https://github.com/P1sec/pycrate/commit/47aaef8528d459e4df68a9854079c76679c7daec