genevaers / Research-And-Development

Apache License 2.0
2 stars 2 forks source link

How to transfer test data to mainframe #5

Open SaPeresi opened 4 years ago

SaPeresi commented 4 years ago

Files are sitting on ascii and will be converted to standard format there with UTF-8 possibly be compressed into a tarfile. Will PAX on mainframe unzip, uncompress and convert to EBCDIC?

bobmcc9 commented 4 years ago

Good Question

a. will they reside in USS, if so could stay as ASCII / UTF-8

b. if going into a MVS (aka z/OS) dataset then conversion to ebcdic, which could be handled by a suitable .gitatributes file and a git clone to USS

SaPeresi commented 3 years ago

To progress this work @KipTwitchell @SaPeresi and Neil are working four days a week before scrum meeting. Waiting on password resets at this time.

SaPeresi commented 3 years ago

Access to Vicom established. Data found on Data world and access available. Unable to FTP from data world to mainframe - there needs to be two methods - on to handle zip files and one to do non-zip file. FTP parameters still not known - ones used have not worked correctly.
Might be able to use a linux script - if we can get the unzip happening there. Randall will ask Bob tonight if he has any ideas.

bobmcc9 commented 3 years ago

Folks

a. if we look at a single CSV file from DATA WORLD FY11EXP.TXT

b. the original zip file was downloaded from DATA WORLD as a zip file to a Windows 10 platform

c. after a unzip here is a sample hex listing of the start of this file

"AGY_AGENCY_KEY" "FNDDTL_FUND_DETAIL_KEY" "OBJ_OBJECT_KEY" "SPRG_SUB_PROGRAM_KEY" "VENDOR_NAME" "AMOUNT" 241 10454 249 3048 "Payroll" 4.99 241 10454 249 3048 "Payroll" 4.99 241 10454 80 3048 "FedEx" 34.78 241 10454 249 3048 "Payroll" 4.99 241 10454 249 3048 "Payroll" 4.97 241 10454 95 3048 "Tata Institute Of Fundamental Research" 380

22 41 47 59 5f 41 47 45 4e 43 59 5f 4b 45 59 22 09 22 46 4e 44 44 54 4c 5f 46 55 4e 44 5f 44 45 54 41 49 4c 5f 4b 45 59 22 09 22 4f 42 4a 5f 4f 42 4a 45 43 54 5f 4b 45 59 22 09 22 53 50 52 47 5f 53 55 42 5f 50 52 4f 47 52 41 4d 5f 4b 45 59 22 09 22 56 45 4e 44 4f 52 5f 4e 41 4d 45 22 09 22 41 4d 4f 55 4e 54 22 0d 0a 32 34 31 09 31 30 34 35 34 09 32 34 39 09 33 30 34 38 09 22 50 61 79 72 6f 6c 6c 22 09 34 2e 39 39 0d 0a 32 34 31 09 31 30 34 35 34 09 32 34 39 09 33 30 34 38 09 22 50 61 79 72 6f 6c 6c 22 09 34 2e 39 39 0d 0a 32 34 31 09 31 30 34 35 34 09 38 30 09 33 30 34 38 09 22 46 65 64 45 78 22 09 33 34 2e 37 38 0d 0a 32 34 31 09 31 30 34 35 34 09 32 34 39 09 33 30 34 38 09 22 50 61 79 72 6f 6c 6c 22 09 34 2e 39 39 0d 0a 32 34 31 09 31 30 34 35 34 09 32 34 39 09 33 30 34 38 09 22 50 61 79 72 6f 6c 6c 22 09 34 2e 39 37 0d 0a 32 34 31 09 31 30 34 35 34 09 39 35 09 33 30 34 38 09 22 54 61 74 61 20 49 6e 73 74 69 74 75 74 65 20 4f 66 20 46 75 6e 64 61 6d 65 6e 74 61 6c 20 52 65 73 65 61 72 63 68 22 09 33 38 30 0d 0a

d, the original zip file was FTP to USS as BINARY

e. the file was unzipped to FY11q1exp.TXT and here is the hex value of this file

00000000: 2241 4759 5f41 4745 4e43 595f 4b45 5922 . å߬ åá+ä߬.áß. 00000010: 0922 464e 4444 544c 5f46 554e 445f 4445 ..ã+ààè<¬ãí+à¬àá 00000020: 5441 494c 5f4b 4559 2209 224f 424a 5f4f è ñ<¬.áß...|⢬| 00000030: 424a 4543 545f 4b45 5922 0922 5350 5247 â¢áäè¬.áß...ë&êå 00000040: 5f53 5542 5f50 524f 4752 414d 5f4b 4559 ¬ëíâ¬&ê|åê (¬.áß 00000050: 2209 2256 454e 444f 525f 4e41 4d45 2209 ...îá+à|ê¬+ (á.. 00000060: 2241 4d4f 554e 5422 0d0a 3234 3109 3130 . (|í+è......... 00000070: 3435 3409 3234 3909 3330 3438 0922 5061 ..............&/ 00000080: 7972 6f6c 6c22 0934 2e39 390d 0a32 3431 Ê?%%........... 00000090: 0931 3034 3534 0932 3439 0933 3034 3809 ................ 000000a0: 2250 6179 726f 6c6c 2209 342e 3939 0d0a .&/Ê?%%........ 000000b0: 3234 3109 3130 3435 3409 3830 0933 3034 ................ 000000c0: 3809 2246 6564 4578 2209 3334 2e37 380d ...ãÁÀáÌ........ 000000d0: 0a32 3431 0931 3034 3534 0932 3439 0933 ................ 000000e0: 3034 3809 2250 6179 726f 6c6c 2209 342e .....&/Ê?%%.... 000000f0: 3939 0d0a 3234 3109 3130 3435 3409 3234 ................ 00000100: 3909 3330 3438 0922 5061 7972 6f6c 6c22 ........&/Ê?%%. 00000110: 0934 2e39 370d 0a32 3431 0931 3034 3534 ................

f. using OCOPY to copy to a file in EBCDIC format in USS we have

"AGY_AGENCY_KEY"."FNDDTL_FUND_DETAIL_KEY"."OBJ_OBJECT_KEY"."SPRG_SUB_PROGRAM_KEY 7CCE6CCCDCE6DCE707CDCCED6CEDC6CCECCD6DCE707DCD6DCDCCE6DCE707EDDC6EEC6DDDCDCD6DCE F178D175538D258F5F654433D6454D453193D258F5F621D621533D258F5F2797D242D7967914D258

241.10454.249.3048."Payroll".4.99. FFF0FFFFF0FFF0FFFF07D8A999970F4FF0 2415104545249530485F7189633F54B99D

241.10454.249.3048."Payroll".4.99. FFF0FFFFF0FFF0FFFF07D8A999970F4FF0 2415104545249530485F7189633F54B99D

241.10454.80.3048."FedEx".34.78. FFF0FFFFF0FF0FFFF07C88CA70FF4FF0 241510454580530485F65457F534B78D

241.10454.249.3048."Payroll".4.99. FFF0FFFFF0FFF0FFFF07D8A999970F4FF0 2415104545249530485F7189633F54B99D

241.10454.249.3048."Payroll".4.97. FFF0FFFFF0FFF0FFFF07D8A999970F4FF0 2415104545249530485F7189633F54B97D

241.10454.95.3048."Tata Institute Of Fundamental Research".380. FFF0FFFFF0FF0FFFF07E8A84C9AA8AAA84D84CA988989A894D8A8898870FFF0 241510454595530485F31310952393435066064541455313095251938F5380D

SaPeresi commented 3 years ago

All reference data from Data world for VA data has been added to the vicom mainframe. It is in fixed format and under Geneva0,ref*


      GENEVA0.REF.AGENCY                                   
      GENEVA0.REF.FUND                                     
      GENEVA0.REF.OBJECT                                   
      GENEVA0.REF.PRGM                                     
      GENEVA0.REF.SOURCE                                   

*** End of Data Set list *

We have a working (*ahem) version of the workbench on machines using postgreSQL which we will start to play with to break.
I will add the lr's for these files.
@bobmcc9 Can we get one of the above mentioned files to the mainframe? Can you post the mainframe name of the file? I guess we also need some type of data dictionary for these files with file layouts including lengths and formats.

bobmcc9 commented 3 years ago

GENEVA0.REF.FUND may be truncated as it has 16 extensions . Last record says 9789 1392 HIGHER ED STATEWIDE ORIGINAL IDC RATE - ARRA GENEVA0.REF.SOURCE may be truncated as it has 16 extensions. Last record says 1290 93 FEDERAL GRANTS AND CONTRACTS

bobmcc9 commented 3 years ago

GENEVA0.FY11Q1.EXP.TXT has financial data in a type of CSV format "AGY_AGENCY_KEY"."FNDDTL_FUND_DETAIL_KEY"."OBJ_OBJECT_KEY"."SPRG_SUB_PROGRAM_KEY"."VENDOR_NAME"."AMOUNT". 241.10454.249.3048."Payroll".4.99.
241.10454.249.3048."Payroll".4.99.

bobmcc9 commented 3 years ago

Conversion

a. I have taken the CSV file GENEVA0.FY11Q1.EXP.TXT and converted it to a standard file with fix data columns

b. look in GENEVA0.FY11Q1.EXP.COLUMN

AGY_AGENCY_KEY Columns 1 - 8 unsigned numeric data FNDDTL_FUND_DETAIL_KEY Columns 9 - 16 unsigned numeric data OBJ_OBJECT_KEY Columns 17 - 24 unsigned numeric data SPRG_SUB_PROGRAM_KEY Columns 25 - 32 unsigned numeric data VENDORNAME Columns 33 - 113 character data AMOUNT Columns 114 - 124 signed numeric data in non numeric form - will need more massaging

SaPeresi commented 3 years ago

Thanks Bob - never noticed that I will look into it. Never noticed that it had gone to 16 extensions. Surprised the upload didn't show an error.
How did you put the agency up and a numeric field. When I uploaded the csv it became automatic alphanumeric. I can change it on the mainframe but just curious how you did it.
All my keys are 4 alpha - maybe be a good test case to move to numeric in view then do the lookup for one of them.

bobmcc9 commented 3 years ago

Packed Data

a. possibly to facilitate processing I have converted the numeric data in the CSV file a packed format

Setup packed fields using rexx

FileName GENEVA0.FY11Q1.EXP.COLUMN.PACK

----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8 ***** Top of Data ** 00000241000104540000024900003048Payroll
00000241000104540000024900003048Payroll
00000241000104540000008000003048FedEx
00000241000104540000024900003048Payroll
00000241000104540000024900003048Payroll
00000241000104540000009500003048Tata Institute Of Fundamental Research

----+----9----+----0----+----1----+----2 ***** Top of .....ñæ .....ñæ .....åð .....ñæ .....ñ@ .......

hex view of packed fields


.....ñæ 0000049 000009C

.....ñæ 0000049 000009C

.....åð 0000048 000037C

a. this is S9(11)V99 COMP-3 startting in record position 113.

SaPeresi commented 3 years ago

Source and Fund has been uploaded again - they now look complete. Thanks for catching this issue.

SaPeresi commented 3 years ago

We will change the data for the keys with dfsort over the next few days to make the keys 8 unsigned numeric.

SaPeresi commented 3 years ago

Bob quick question about the file you uploaded which may give us a problem. Wondering how much it would take to make the lower case all upper case and reupload and create the file. The mainframe doesn't deal well with lower case records. I am unaware of anything that will change that but even if there was, we will need to do this for more files.

SaPeresi commented 3 years ago

File Layouts and Data Dictionary
Keys Format Length VA_Exp
AGY_AGENCY_KEY numeric 8 FNDDTL_FUND_DETAIL_KEY numeric 8 OBJ_OBJECT_KEY numeric 8 SPRG_SUB_PROGRAM_KEY numeric 8 VENDOR_NAME alphanumeric 80 AMOUNT packed numeric 10 VOUCHER_DATE alphanumeric 8

VA_Ref_Agency
AGY_AGENCY_CODE Primary Key numeric 8 AGY_AGENCY_KEY Foreign Key numeric 8 AGY_AGENCY_NAME alphanumeric 80

VA_Ref_Fund
FNDDTL_FUND_DETAIL_KEY Primary Key numeric 8 FNDDTL_FUND_DETAIL_CODE alphanumeric 4 FNDDTL_FUND_DETAIL_NAME alphanumeric 58 FND_FUND_CODE alphanumeric 2 FND_FUND_NAME alphanumeric 80

VA_Ref_Object
OBJ_OBJECT_KEY Primary Key numeric 8 DIGITOBJECT CODE_2 alphanumeric 2 DIGIT_OBJECT_NAME_2 alphanumeric 110 DIGIT_OBJECT_CODE_4 alphanumeric 4 DIGIT_OBJECT_NAME_4 alphanumeric 110

VA_Ref_Src
SRC_SOURCE_KEY Primary Key numeric 8 SRCCLS_SOURCE_CLASS_CODE alphanumeric 4 SRCCLS_SOURCE_CLASS_NAME alphanumeric 58 SRC_SOURCE_CODE alphanumeric 4 ,, (filler) alphanumeric 36 SRC_SOURCE_NAME alphanumeric 110

VA_Ref_Prgm
SPRG_SUB_PROGRAM_KEY Primary Key numeric 8 PFCN_FUNCTION_CODE alphanumeric 1 PFCN_FUNCTION_NAME, alphanumeric 42 PRG_PROGRAM_CODE alphanumeric 3 PRG_PROGRAM_NAME alphanumeric 65 SPRG_SUB_PROGRAM_CODE alphanumeric 5 ,, alphanumeric 1 SPRG_SUB_PROGRAM_NAME alphanumeric 110

I have this is an excel that I will put in the google docs area but there is also other stuff in the excel and not sure if it all can be posted. Well I spent forever formatting that but it went back to original on display - go figure.

SaPeresi commented 3 years ago

The reference files are not yet fixed - there is an issue open for this. Sample program in works and cobol copybook for exp is available in Geneva2.sandy.cobol in Vicom.

bobmcc9 commented 3 years ago

Upper Case Enquiry from Sandy

a. i looked at the files I have been involved with on VICOM

GENEVA0.FY11Q1.EXP.COLUMN
GENEVA0.FY11Q1.EXP.COLUMN.PACK GENEVA0.FY11Q1.EXP.TXT

b. they have data in lower case, this is only file that I have unzipped from the voluminous zips from State of Virginia

c. could you refresh my meagre memory and indicate which file I should be looking at please

many kind thanx bob

SaPeresi commented 3 years ago

Bob, I believe it is the second in the list, the one with the pack as the lowest level qualifier. The mainframe and geneva on the mainframe does not like lower case letters. All mainframe files use upper case. Can you make your process create the lowercase to uppercase? If you document what you do to load a file - perhaps one of our newer folks would like to do the process to get the rest of the files up there.

SaPeresi commented 3 years ago

Sample program is complete for changing the key fields from alphanumeric (4) to numeric (8). It can be found in Geneva2.sandy.cobol(agyprm) and the compile link edit and execution can be found in Geneva2.sandy.jcl(exagypgm).

We still need to do ref.object, ref.fund, ref.src, and ref.prgm - probably using the cobol program as an example. This is a good first issue for a newer mainframe programmer to experience. Happy to explain the jobs to anyone wanting to pick that up.

bobmcc9 commented 3 years ago

Ah, I have spent most of my mainframe life getting upper case to lower case. Hence my confusion

so, I have changed my REXX and it now produces upper case

BROWSE GENEVA0.FY11Q1.EXP.COLUMN.PACK.UPPER Line 0000000000 Col Command ===> Scroll = ***** Top of Data * 00000241000104540000024900003048PAYROLL
00000241000104540000024900003048PAYROLL
00000241000104540000008000003048FEDEX
00000241000104540000024900003048PAYROLL
00000241000104540000024900003048PAYROLL
00000241000104540000009500003048TATA INSTITUTE OF FUNDAMENTAL RESEARCH 00000241000104540000052400003048PAYROLL
00000241000104540000052400003048PAYROLL
00000241000104540000052400003048PAYROLL
00000241000104540000052400003048PAYROLL
00000241000104540000052400003048JOURNAL ENTRY

at the moment, the REXX is very specific to this particular input CSV, conducted as a pilot project

Moving forward perhaps use of FIle Manager and subsequent tweaks may be better

SaPeresi commented 3 years ago

I think we are planning only on using one or with the advent of a date in the file - two layouts. So doing this with REXX might be ok? Your thoughts?