Closed Kunjahamed-P closed 4 years ago
Because you specified the ChoCSVWriter without record field declarations, it determines them before writing the first row. This is known as an automatic fields map. It is done from the first source row, and because there are only 2 items in the otherState list, only 2 columns are written.
I am not sure if there is an option to use more source rows (2, 3, 4, all) for determining all possible fields, because you have to wait until all fields are known before you can begin to write the header of the CSV file.
I thought there was an option, but I can't find it.
@Cinchoo please correct me if I am wrong.
This is how you can get your expected output:
StringBuilder csv = new StringBuilder();
using (var r = ChoJSONReader.LoadText(json)
    .WithJSONPath("$..getUsers[*]")
    )
{
    using (var w = new ChoCSVWriter(csv)
        .WithFirstLineHeader()
        .Configure(c => c.MaxScanRows = 2)
        .Configure(c => c.ThrowAndStopOnMissingField = false)
        )
    {
        w.Write(r);
    }
}
Console.WriteLine(csv.ToString());
Got it: the columns discovered from the later source rows (2, 3, 4, all) end up at the end of the CSV. Is there an option to keep all similar columns near each other? If not, ignore it. And one more question: is there any performance issue when increasing the 'MaxScanRows' count? Thank you for the quick response.
The main problem is that when I use the #67 approach, the columns come after the second file's columns.
Got it: the columns discovered from the later source rows (2, 3, 4, all) end up at the end of the CSV. Is there an option to keep all similar columns near each other? If not, ignore it. And one more question: is there any performance issue when increasing the 'MaxScanRows' count? Thank you for the quick response.
I don't understand your first request.
Second question ('MaxScanRows'): yes, it has some impact. Try to provide an optimal number.
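If completeness matters more than speed, one option (a sketch following the same ChoETL pattern used elsewhere in this thread; `r` and `csv` are assumed to be the reader and StringBuilder from the earlier snippet) is to buffer the records first and scan them all:

```csharp
// Sketch: materialize all records so MaxScanRows can cover every row.
// This trades memory and an extra pass for a complete header.
var records = r.ToArray();
using (var w = new ChoCSVWriter(csv)
    .WithFirstLineHeader()
    .Configure(c => c.MaxScanRows = records.Length)
    .Configure(c => c.ThrowAndStopOnMissingField = false))
{
    w.Write(records);
}
```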
The main problem is that when I use the #67 approach, the columns come after the second file's columns.
Please share with example. Thanks.
Example JSON is:
{
  "data": {
    "getUsers": [
      {
        "UserInformation": {
          "Id": 1111122,
          "firstName": "*****1",
          "UserType": { "name": "CP" },
          "primaryState": "MA",
          "otherState": [ "MA", "BA" ],
          "createdAt": null,
          "lastUpdatedDate": "2019-04-03T07:49:05.2827076-04:00"
        }
      },
      {
        "UserInformation": {
          "Id": 3333,
          "firstName": "*****3",
          "UserType": { "name": "CPP" },
          "primaryState": "MPA",
          "otherState": [ "KL", "TN", "DL", "AP", "RJ" ],
          "createdAt": null,
          "lastUpdatedDate": "2019-12-03T07:50:05.2827076-05:00"
        }
      }
    ]
  },
  "errors": [
    {
      "message": "GraphQL.ExecutionError: 13614711 - NO__DATA",
      "extensions": { "code": "212" }
    }
  ]
}
The output is:
UserInformation_Id,UserInformation_firstName,UserInformation_UserType_name,UserInformation_primaryState,UserInformation_otherState_0,UserInformation_otherState_1,UserInformation_createdAt,UserInformation_lastUpdatedDate,UserInformation_otherState_2,UserInformation_otherState_3,UserInformation_otherState_4
1111122,*****1,CP,MA,MA,BA,,4/3/2019 5:19:05 PM,,,
3333,*****3,CPP,MPA,KL,TN,,12/3/2019 6:20:05 PM,DL,AP,RJ
The column order is UserInformation_otherState_0, UserInformation_otherState_1, UserInformation_createdAt, ..., UserInformation_otherState_2, UserInformation_otherState_3, ...
The problem is that UserInformation_otherState_2 and UserInformation_otherState_3 only appear as the last columns. I also found a date conversion issue: 'lastUpdatedDate' in the input JSON is '2019-04-03T07:49:05.2827076-04:00', but it gets converted into '4/3/2019 5:19:05 PM'. In my case I need exactly what I passed in.
The expected output is:
UserInformation_Id,UserInformation_firstName,UserInformation_UserType_name,UserInformation_primaryState,UserInformation_otherState_0,UserInformation_otherState_1,UserInformation_otherState_2,UserInformation_otherState_3,UserInformation_otherState_4,UserInformation_createdAt,UserInformation_lastUpdatedDate
1111122,*****1,CP,MA,MA,BA,,,,,2019-04-03T07:49:05.2827076-04:00
3333,*****3,CPP,MPA,KL,TN,DL,AP,RJ,,2019-12-03T07:50:05.2827076-05:00
I faced this date offset issue before when using Newtonsoft directly; the fix I did uses the following code:
JsonSerializerSettings jsonSerializerSettings = new JsonSerializerSettings
{
    DateTimeZoneHandling = DateTimeZoneHandling.Utc,
    DateParseHandling = DateParseHandling.DateTimeOffset
};

using (var r = ChoJSONReader.LoadText(json)
    .WithJSONPath("$..getUsers[*]")
    .Configure(c => c.JsonSerializerSettings = new JsonSerializerSettings
    {
        DateTimeZoneHandling = DateTimeZoneHandling.Utc,
    })
    )
{
    using (var w = new ChoCSVWriter(csv)
        .WithFirstLineHeader()
        .Configure(c => c.MaxScanRows = 2)
        .Configure(c => c.ThrowAndStopOnMissingField = false)
        )
    {
        w.Write(r);
    }
}
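The underlying reason DateParseHandling.DateTimeOffset helps is that System.DateTime drops the original UTC offset when parsing, while System.DateTimeOffset keeps it. A minimal BCL-only illustration (independent of ChoETL and Json.NET):

```csharp
using System;
using System.Globalization;

class DateOffsetDemo
{
    static void Main()
    {
        string raw = "2019-04-03T07:49:05.2827076-04:00";

        // DateTimeOffset keeps the -04:00 offset from the source text...
        var dto = DateTimeOffset.Parse(raw, CultureInfo.InvariantCulture);

        // ...so the round-trip ("o") format reproduces the input exactly.
        Console.WriteLine(dto.ToString("o", CultureInfo.InvariantCulture));
        // Prints: 2019-04-03T07:49:05.2827076-04:00
    }
}
```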
I have not found any updates after 1.1.0.5-alpha9. The second issue is fixed, thanks. Is there any detailed documentation about ChoETL?
Pls get 1.1.0.5-alpha10 package, try & let me know.
The latest package at nuget.org is 1.1.0.5-alpha9 and was published 5 days ago.
On 7 January 2020 at 06:13:46 CET, Cinchoo notifications@github.com wrote:
Pls get 1.1.0.5-alpha10 package, try & let me know.
Awesome 👍 it's working fine. The 1.1.0.5-alpha10 version does not show in the update tab, because it shows near version 1.
Found another field order issue. When trying with the following JSON, the columns are not in the correct order:
{
  "data": {
    "getUsers": [
      {
        "personalInformation": {
          "userId": 13610642,
          "firstName": "***",
          "languagesSpoken": null,
          "state": [ "CA", "IL" ]
        }
      },
      {
        "personalInformation": {
          "userId": 13611014,
          "firstName": "**",
          "languagesSpoken": [
            { "name": "Afrikaans" },
            { "name": "Albanian" },
            { "name": "American Sign Language" }
          ],
          "state": [ "WA", "TX", "GA", "MN", "NV" ]
        }
      },
      {
        "personalInformation": {
          "userId": 13611071,
          "firstName": "***",
          "languagesSpoken": [
            { "name": "Albanian" },
            { "name": "Hindi" },
            { "name": "Telugu" },
            { "name": "Malayalam" },
            { "name": "Tamil" }
          ],
          "state": [ "OK", "AK", "WA", "MA", "GA", "MN" ]
        }
      },
      {
        "personalInformation": {
          "userId": 13611074,
          "firstName": "********",
          "languagesSpoken": null,
          "state": [ "AZ" ]
        }
      },
      {
        "personalInformation": {
          "userId": 13611082,
          "firstName": "******",
          "languagesSpoken": [
            { "name": "Estonian" },
            { "name": "Faroese" },
            { "name": "English" },
            { "name": "Hindi" }
          ],
          "state": [ "AK", "CA", "GA", "IL", "NC", "NV", "TX", "OK", "OR", "MA", "MN", "MS", "WA", "WV", "CO" ]
        }
      },
      {
        "personalInformation": {
          "userId": 13611227,
          "firstName": "**",
          "languagesSpoken": [
            { "name": "Latvian" },
            { "name": "English" },
            { "name": "Fiji" },
            { "name": "Hindi" },
            { "name": "Japanese" },
            { "name": "Sanskrit" },
            { "name": "Zhuang" }
          ],
          "state": [ "CO", "GA", "IL", "MN", "MS", "MA", "NC", "NV", "OK", "OR", "WA", "WV" ]
        }
      }
    ]
  }
}
The output is:
personalInformation_userId,personalInformation_firstName,personalInformation_languagesSpoken_0_name,personalInformation_languagesSpoken_1_name,personalInformation_languagesSpoken_2_name,personalInformation_languagesSpoken_3_name,personalInformation_state_0,personalInformation_state_1,personalInformation_state_2,personalInformation_state_3,personalInformation_state_4,personalInformation_state_5,personalInformation_state_6,personalInformation_state_7,personalInformation_state_8,personalInformation_state_9,personalInformation_state_10,personalInformation_state_11,personalInformation_state_12,personalInformation_state_13,personalInformation_state_14,personalInformation_languagesSpoken_4_name,personalInformation_languagesSpoken,personalInformation_languagesSpoken_5_name,personalInformation_languagesSpoken_6_name
13610642,***,,,,,CA,IL,,,,,,,,,,,,,,,,,
13611014,**,Afrikaans,Albanian,American Sign Language,,WA,TX,GA,MN,NV,,,,,,,,,,,,,,
13611071,***,Albanian,Hindi,Telugu,Malayalam,OK,AK,WA,MA,GA,MN,,,,,,,,,,Tamil,,,
13611074,********,,,,,AZ,,,,,,,,,,,,,,,,,,
13611082,******,Estonian,Faroese,English,Hindi,AK,CA,GA,IL,NC,NV,TX,OK,OR,MA,MN,MS,WA,WV,CO,,,,
13611227,**,Latvian,English,Fiji,Hindi,CO,GA,IL,MN,MS,MA,NC,NV,OK,OR,WA,WV,,,,Japanese,,Sanskrit,Zhuang
The 'personalInformation_languagesSpoken' columns are not in the correct order, but the 'personalInformation_state' columns are in the correct order.
@Cinchoo did you get the issue?
Managed to tune up the auto scan logic to cover most of the scenarios. Please take the new package 1.1.0.5-beta1 and try it.
There may be some cases where the framework is not able to figure out each field type from the scan. In such cases, you can subscribe to the FileHeaderArrange event to take control of the field list, order, etc.
Hope it helps.
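As a sketch of that event (assuming `e.Fields` is a mutable list of header-name strings, as in the snippet later in this thread), one could sort the discovered fields so that related numbered columns such as otherState_0, otherState_1, ... stay adjacent:

```csharp
// Sketch: alphabetically reorder the auto-discovered CSV header fields
// inside the FileHeaderArrange handler, so numbered array columns group.
w.FileHeaderArrange += (o, e) =>
{
    var ordered = e.Fields.OrderBy(f => f, StringComparer.Ordinal).ToList();
    e.Fields.Clear();
    foreach (var f in ordered)
        e.Fields.Add(f);
};
```

Note this requires a using directive for System.Linq, and an ordinal sort places "_10" before "_2"; a real handler would apply whatever ordering the output actually needs.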
Q1) It's working, but most of the nodes are not converted to CSV. When trying with the previous version my CSV data size was 600 KB; with the latest version it is only 57 KB. In the new version the order works as expected, but most of the data is missing. Thanks for your help.
Q2) Here, the first and last elements of the 'errors' node are different.
StringBuilder csvErrors = new StringBuilder();
using (var errors = ChoJSONReader.LoadText(caqhJson)
    .WithJSONPath("$..errors[*]")
    .WithField("errors_message", jsonPath: "$.message")
    .WithField("errors_extensions_code", jsonPath: "$.extensions.code")
    .WithField("errors_locations", jsonPath: "$.locations[*]")
    .WithField("errors_path", jsonPath: "$.path[*]")
    )
{
    var arrError = errors.ToArray();
    int errorCount = arrError.Length;
    using (var w = new ChoCSVWriter(csvErrors)
        .WithFirstLineHeader()
        .Configure(c => c.MaxScanRows = errorCount)
        .Configure(c => c.ThrowAndStopOnMissingField = false)
        )
    {
        w.Write(arrError);
    }
}
When using the above code, the output header is "errors_message_0 errors_extensions_code_0 errors_locations_0_line errors_locations_0_column errors_path_0 errors_path_1 errors_path_2 errors_path_3 errors_path_4 errors_locations errors_path errors_path_5". It shows an unwanted '0' in the headers, and empty headers at the end ('errors_locations' and 'errors_path').
Q3) One more thing: is there any chance to get the output of an array delimited by a pipe? For the above example, the 'state' node should produce:
personalInformation_state
CA|IL
WA|TX|GA|MN|NV
No need to do this for the 'languagesSpoken' array; only plain arrays. Is there any configuration available? Please let me know.
Q4) I didn't get the 'FileHeaderArrange' event. Is there any doc or example?
Q1) Found the issue and put in a fix. Take the new package 1.1.0.5-beta2 and try it.
Q2) I'm not clear on this question, but here is my understanding: you want errors_message
to be treated as a string (not an array). You can do so by setting the `isArray` flag to false.
StringBuilder csvErrors = new StringBuilder();
using (var errors = new ChoJSONReader("sample33.json")
    .WithJSONPath("$..errors[*]")
    .WithField("errors_message", jsonPath: "$.message", isArray: false)
    .WithField("errors_extensions_code", jsonPath: "$.extensions.code", isArray: false)
    .WithField("errors_locations", jsonPath: "$.locations[*]", isArray: false)
    .WithField("errors_path", jsonPath: "$.path[*]")
    )
{
    var arrError = errors.ToArray();
    int errorCount = arrError.Length;
    using (var w = new ChoCSVWriter(csvErrors)
        .WithFirstLineHeader()
        .Configure(c => c.MaxScanRows = errorCount)
        .Configure(c => c.ThrowAndStopOnMissingField = false)
        )
    {
        // Move the first discovered field to the end of the header.
        w.FileHeaderArrange += (o, e) =>
        {
            var first = e.Fields.First();
            e.Fields.RemoveAt(0);
            e.Fields.Add(first);
        };
        w.Write(arrError);
    }
}
Q3) Yes, you can combine the states pipe-delimited. This sample shows how:
StringBuilder csv = new StringBuilder();
using (var r = ChoJSONReader.LoadText(json)
    .WithJSONPath("$..getUsers[*]")
    .Configure(c => c.JsonSerializerSettings = new JsonSerializerSettings
    {
        DateTimeZoneHandling = DateTimeZoneHandling.Utc,
        //DateParseHandling = DateParseHandling.DateTimeOffset
    })
    )
{
    using (var w = new ChoCSVWriter(csv)
        .WithFirstLineHeader()
        .Configure(c => c.MaxScanRows = 1)
        .Configure(c => c.ThrowAndStopOnMissingField = false)
        )
    {
        w.Write(r.Select(r1 =>
        {
            r1.personalInformation.state = String.Join("|", ((IList)r1.personalInformation.state).OfType<string>());
            return r1;
        }));
    }
}
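The String.Join flattening used above can be tried in isolation with BCL types only; this is the entire trick behind the pipe-delimited cell:

```csharp
using System;
using System.Collections;
using System.Linq;

class PipeJoinDemo
{
    static void Main()
    {
        // A dynamic JSON array surfaces as a non-generic IList; OfType<string>()
        // filters/casts it, and String.Join collapses it into one CSV cell.
        IList states = new ArrayList { "CA", "IL" };
        Console.WriteLine(String.Join("|", states.OfType<string>()));
        // Prints: CA|IL
    }
}
```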
Q4) See sample code in Q2.
Q1) Fixed, but the CSV data is not correct. When trying with the latest version, the output is:
usersProfileDetail_usersStatus_name,usersProfileDetail_usersStatusDate,usersProfileDetail_attestId,usersProfileDetail_lastAttestationDate,usersProfileDetail_Id,usersLocation_0_activeLocation_0_ServiceName,usersLocation_0_activeLocation_0_ServiceType2Npi,usersLocation_0_activeLocation_0_ServicePrimaryTaxID,usersLocation_0_activeLocation_0_ServiceTaxId_0_taxIdNumber,usersLocation_0_activeLocation_0_ServiceTaxId_0_modifiedAt,usersLocation_0_activeLocation_0_ServiceTaxId_0_createdAt,usersLocation_0_activeLocation_0_parentOrganization,usersLocation_0_activeLocation_0_standardizedLocationID,usersLocation_0_activeLocation_0_locationType2Npi,usersLocation_0_activeLocation_0_locationTaxId_0_taxIdNumber,usersLocation_0_activeLocation_0_officeType_name,usersLocation_0_activeLocation_0_address_address1,usersLocation_0_activeLocation_0_address_address2,usersLocation_0_activeLocation_0_address_city,usersLocation_0_activeLocation_0_address_state,usersLocation_0_activeLocation_0_address_zipCode,usersLocation_0_activeLocation_0_address_country,usersLocation_0_activeLocation_0_address_addressStandardizedFlag,usersLocation_0_activeLocation_0_address_addressStandardizationSource,usersLocation_0_activeLocation_0_locationAccessibility_adaAccessibilityFlag,usersLocation_0_activeLocation_0_locationAccessibility_accommodateIntellectuallyCognitivelyDisabledFlag,usersLocation_0_activeLocation_0_LimitationFlag,usersLocation_0_activeLocation_0_Limitation_genderLimitationFlag,usersLocation_0_activeLocation_0_Limitation_modifiedAt,usersLocation_0_activeLocation_0_Limitation_createdAt,usersLocation_0_activeLocation_0_usersLocationDetail_0_primaryFlag,usersLocation_0_activeLocation_0_usersLocationDetail_0_modifiedAt,usersLocation_0_activeLocation_0_usersLocationDetail_0_createdAt,usersLocation_0_activeLocation_0_healthPlanSubmittedAddress,usersLocation_0_activeLocation_0_modifiedAt,usersLocation_0_activeLocation_0_createdAt,usersLocation_0_activeLocation_1_ServiceName,usersLocation_0_activeLocation
_1_ServiceType2Npi,usersLocation_0_activeLocation_1_ServicePrimaryTaxID,usersLocation_0_activeLocation_1_ServiceTaxId_0_taxIdNumber,usersLocation_0_activeLocation_1_ServiceTaxId_0_modifiedAt,usersLocation_0_activeLocation_1_ServiceTaxId_0_createdAt,usersLocation_0_activeLocation_1_parentOrganization,usersLocation_0_activeLocation_1_standardizedLocationID,usersLocation_0_activeLocation_1_locationType2Npi,usersLocation_0_activeLocation_1_locationTaxId_0_taxIdNumber,usersLocation_0_activeLocation_1_officeType_name,usersLocation_0_activeLocation_1_address_address1,usersLocation_0_activeLocation_1_address_address2,usersLocation_0_activeLocation_1_address_city,usersLocation_0_activeLocation_1_address_state,usersLocation_0_activeLocation_1_address_zipCode,usersLocation_0_activeLocation_1_address_country,usersLocation_0_activeLocation_1_address_addressStandardizedFlag,usersLocation_0_activeLocation_1_address_addressStandardizationSource,usersLocation_0_activeLocation_1_locationAccessibility_adaAccessibilityFlag,usersLocation_0_activeLocation_1_locationAccessibility_accommodateIntellectuallyCognitivelyDisabledFlag,usersLocation_0_activeLocation_1_LimitationFlag,usersLocation_0_activeLocation_1_Limitation_genderLimitationFlag,usersLocation_0_activeLocation_1_Limitation_modifiedAt,usersLocation_0_activeLocation_1_Limitation_createdAt,usersLocation_0_activeLocation_1_usersLocationDetail_0_primaryFlag,usersLocation_0_activeLocation_1_usersLocationDetail_0_modifiedAt,usersLocation_0_activeLocation_1_usersLocationDetail_0_createdAt,usersLocation_0_activeLocation_1_healthPlanSubmittedAddress,usersLocation_0_activeLocation_1_modifiedAt,usersLocation_0_activeLocation_1_createdAt,usersLocation_0_rejectedLocation,errors_message,errors_extensions_code,errors_locations_line,errors_locations_column,errors_path_0,errors_path_1,errors_path_2,errors_path_3,errors_path_4,errors_path_5
User one,2018-02-14T05:04:38+00:00,1111,2019-04-03T07:49:05.2827076-04:00,2222,oz*****,,,000000000,2019-02-01T03:44:02+00:00,,,,,000000000,"Other ",vdgvd,,New york,CA,894165123,ALGERIA,,,,,,False,2019-02-01T03:44:43+00:00,,False,2019-02-01T03:43:46+00:00,,,,,dk************,,,000000000,2019-02-01T03:44:57+00:00,,,,,000000000,"Primary ",New building,,Near Burj Khalifa,AR,12345,ANDORRA,,,,,,False,2019-02-01T03:45:38+00:00,,True,,,,,,,GraphQL.ExecutionError: 34444444 - NOT_IN_ELIGIBLE,216,374,9,getUsers,11,usersxxxxxxx,0,abc,PrivilegeFlag
user two,2017-04-04T07:48:25+00:00,,2019-02-01T03:50:42.6049634-05:00,2222,oz*****,,,000000000,2019-02-01T03:44:02+00:00,,,,,000000000,"Other ",vdgvd,,New york,CA,894165123,ALGERIA,,,,,,False,2019-02-01T03:44:43+00:00,,False,2019-02-01T03:43:46+00:00,,,,,dk************,,,000000000,2019-02-01T03:44:57+00:00,,,,,000000000,"Primary ",New building,,Near Burj Khalifa,AR,12345,ANDORRA,,,,,,False,2019-02-01T03:45:38+00:00,,True,,,,,,,GraphQL.ExecutionError: 13610742 - NOT_IN_ELIGIBLE_STATUS,216,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,GraphQL.ExecutionError: Cannot return null for non-null type. Field:,ID: 123456,374,9,getUsers,11,usersxxxxxxx,0,abc,PrivilegeFlag
Here the second row's 'usersLocation_0_activeLocation_0_ServiceName' value is repeated in the first row; I think all the 'activeLocation' nodes are repeating.
Q2) Working as expected, but there is a small issue in the CSV output. When trying with the above code, the output is:
errors_message,errors_extensions_code,errors_locations_line,errors_locations_column,errors_path_0,errors_path_1,errors_path_2,errors_path_3,errors_path_4,errors_path_5
GraphQL.ExecutionError: 34444444 - NOT_IN_ELIGIBLE,216,374,9,getUsers,11,usersxxxxxxx,0,abc,PrivilegeFlag
GraphQL.ExecutionError: 13610742 - NOT_IN_ELIGIBLE_STATUS,216,,,,,,,,
GraphQL.ExecutionError: Cannot return null for non-null type. Field:,ID: 123456,374,9,getUsers,11,usersxxxxxxx,0,abc,PrivilegeFlag
The last row's 'errors_locations_line,errors_locations_column,errors_path_0,errors_path_1,errors_path_2,errors_path_3,errors_path_4,errors_path_5' column values are repeated in the first row. The Q1 and Q2 issues may be the same.
Q3) It's working as expected. Is there any general configuration available?
Q4) Got it, thanks. Thanks for your quick response.
Got it, found the issue. Put in a fix; take the package 1.1.0.5-beta3 and try it.
Thanks for your help.
It's working
I am converting a JSON file to a CSV file. The JSON has multiple nested objects and is large in size. While converting, I am able to get all the values out of the JSON and into the CSV; however, array values are missing. I am using the ChoETL library. The sample JSON is (the original JSON is long and large):
The 'otherState' array in the first element has two entries but in the second it has four; they are not showing in the CSV. The input JSON is long, has a nested hierarchy, and is mostly dynamic. The code is:
The output is:
I want each user in a different row; when trying with all the data in a single row it works fine.