NSSAC / CINES

CINES project repository

snap_GetClustCfAll Output Error #223

Closed hcars closed 1 year ago

hcars commented 3 years ago

@dmachi I am seeing a problem similar to other issues: this script runs and then fails when trying to send the output file. Check out job ID: d78c8a76-90b9-4133-987c-dc209da0c29f. This is the standard output from it:


Begin Staging


Reading Job Information from job.json
Retrieving 162774 bytes in 1 file
Retrieving file from https://sciduct.bii.virginia.edu/fs/file/5a627663-397a-41f5-acfc-e5a81f1ef6b6/1
Downloaded file to input/inputFile_Graph


End Staging


Begin Primary 

==== echo inputs ====
name of graph file containing graph to operate on.: /input/inputFile_Graph
graph type is Snap type, so one of: PNGraph, PUNGraph, PNEANet.: PUNGraph
col index in graphName for the src (source) node of the edge (src, des).: 0
col index in graphName for the des (destination) node of the edge (src, des).: 1
If !=-1 then compute clustering coefficient only for a random sample of SampleNodes nodes: -1
output file name for variable DegToCCfV: /output/outputFile_DegToCCfV.out
output file name for variable libraryMethodReturnTypeVariable: /output/outputFile_libraryMethodReturnTypeVariable.out
==== end echo ====

Time to complete analysis (s, hr): 0.0372614860534668 1.0350412792629666e-05
----- good termination -----


End Primary 


Begin Completion


Reading Job Information from job.json Reading Job Definition Information from job_definition.json Creating container instance 'clustering.out.txt' [folder] at https://sciduct.bii.virginia.edu/fs/file/home/hcars/ Creating file 'outputFile_DegToCCfV.out' [snap_TFltPrV] at https://sciduct.bii.virginia.edu/fs/file/650f18d1-4bb1-4b28-a0a5-07a084805d1b Creating file 'outputFile_libraryMethodReturnTypeVariable.out' [list_primitive_dataType] at https://sciduct.bii.virginia.edu/fs/file/650f18d1-4bb1-4b28-a0a5-07a084805d1b Error pushing files to FileService: Error while creating container instance 'outputFile_libraryMethodReturnTypeVariable.out' [list_primitive_dataType] at https://sciduct.bii.virginia.edu/fs/file/650f18d1-4bb1-4b28-a0a5-07a084805d1b req: {'id': 0, 'jsonrpc': '2.0', 'method': 'delete', 'params': ['/home/hcars/clustering.out.txt', True]} URL: https://sciduct.bii.virginia.edu/fs New File: {'type': 'snap_TFltPrV', 'name': 'outputFile_DegToCCfV.out', 'provenance': {'source': 'SciDuctJobService', 'reference_id': 'd78c8a76-90b9-4133-987c-dc209da0c29f', 'input': {'SampleNodes': -1, 'desCol': 1, 'srcCol': 0, 'inputFile_Graph': '/resources/net.science/workshop_v1/ca-grqc.giant.uel'}, 'input_files': [{'name': 'inputFile_Graph', 'stored_name': 'ca-grqc.giant.uel', 'id': '5a627663-397a-41f5-acfc-e5a81f1ef6b6', 'type': 'PUNGraph', 'isContainer': False, 'version': 1, 'autometa': {'averageEdgeBetweennessCentrality': 7790.371032633, 'averageNodeAuthorityScoreHits': 0.0020296501, 'averageNodeBetweennessCentrality': 10495.1363636363, 'averageNodeDegree': 6.455988456, 'averageNodeEigenvectorCentrality': 0.0020297652, 'averageNodeHubScoreHits': 0.0020296501, 'averageNodeInDegree': 6.455988456, 'averageNodeOutDegree': 6.455988456, 'averageNodePageRank': 0.0002405002, 'destinationNodeIdColumn': 1, 'edgeDirectionality': 'undirected', 'estimatedGraphDiameter': 17.0, 'fracLargestScc': 1.0, 'fracLargestWcc': 1.0, 'fracNodesLargestKcore': 0.0105820106, 'fracNodesSmallestKcore': 1.0, 
'fracSmallestScc': 1.0, 'fracSmallestWcc': 1.0, 'isEdgeAttributed': 0, 'isNodeAttributed': 0, 'isWeaklyConnected': True, 'kcoreSize50percentNodes': 3, 'largestKcore': 43, 'maxEdgeBetweennessCentrality': 217261.4434127171, 'maxNodeAuthorityScoreHits': 0.1555625125, 'maxNodeBetweennessCentrality': 508435.3540110303, 'maxNodeDegree': 81, 'maxNodeEigenvectorCentrality': 0.1555625127, 'maxNodeHubScoreHits': 0.1555625125, 'maxNodeInDegree': 81, 'maxNodeOutDegree': 81, 'maxNodePageRank': 0.0018195221, 'minEdgeBetweennessCentrality': 2.0, 'minNodeAuthorityScoreHits': 0.0, 'minNodeBetweennessCentrality': 0.0, 'minNodeDegree': 1, 'minNodeEigenvectorCentrality': 0.0, 'minNodeHubScoreHits': 0.0, 'minNodeInDegree': 1, 'minNodeOutDegree': 1, 'minNodePageRank': 4.78545e-05, 'numEdges': 13422, 'numNodes': 4158, 'numNodesLargestKcore': 44, 'numNodesSmallestKcore': 4158, 'numSccComponents': 1, 'numSccComponentsLargestSize': 1, 'numSccComponentsSmallestSize': 1, 'numWccComponents': 1, 'numWccComponentsLargestSize': 1, 'numWccComponentsSmallestSize': 1, 'sizeLargestScc': 4158, 'sizeLargestWcc': 4158, 'sizeSmallestScc': 4158, 'sizeSmallestWcc': 4158, 'smallestKcore': 0, 'sourceNodeIdColumn': 0}, 'usermeta': {}, 'hash': '94bdc2c43e40e24894b3ef7f922d04f3', 'size': 162774, 'compute_only': False}]}, 'isContainer': False, 'metadata_only': False, 'is_symbolic': False, 'id': 'a87128b6-406a-472d-82ae-3aea0e55533a', 'container_id': '650f18d1-4bb1-4b28-a0a5-07a084805d1b', 'owner_id': 'hcars', 'created_by': 'hcars', 'updated_by': 'hcars', 'creation_date': '2021-06-21T13:06:17', 'update_date': '2021-06-21T13:06:17', 'readACL': [], 'writeACL': [], 'computeACL': [], 'state': 'empty', 'locked': False, 'finalized': False, 'preserved': False, 'public': False, 'autometa': {}, 'usermeta': {}} Storing file data to https://sciduct.bii.virginia.edu/fs/file/a87128b6-406a-472d-82ae-3aea0e55533a from outputFile_DegToCCfV.out response: {"id":0,"result":"OK"}

dmachi commented 3 years ago

The error on this one is the output file type. It is listed here as “list_primitive_dataType”. The fileservice is expecting “snap_list_primitive_dataType”. In order to work around this issue for today, I have added “list_primitive_dataType” as a valid type. Can you try this again? Also, if you see this issue on any other methods, please do create tickets for them too (though perhaps you were referring to the other issues you already submitted which were slightly different). Thanks for all the reports.
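The type mismatch described above could be caught before the push with a pre-flight check along these lines. This is only a sketch: `find_unknown_types` and the `valid_types` set are hypothetical names, not taken from the file service code, and only the type strings and file names come from the log above.

```python
def find_unknown_types(outputs, valid_types):
    """Return (file name, type) pairs whose type is not in valid_types."""
    return [(name, ftype) for name, ftype in outputs if ftype not in valid_types]


# Output files and declared types as they appear in the job log.
outputs = [
    ("outputFile_DegToCCfV.out", "snap_TFltPrV"),
    ("outputFile_libraryMethodReturnTypeVariable.out", "list_primitive_dataType"),
]

# Hypothetical registry of accepted types (the real list is not shown here).
valid_types = {"folder", "snap_TFltPrV", "snap_list_primitive_dataType"}

# Before the workaround, the unprefixed type is rejected.
before = find_unknown_types(outputs, valid_types)

# The workaround: accept the unprefixed name as a valid type too.
valid_types.add("list_primitive_dataType")
after = find_unknown_types(outputs, valid_types)
```

With these assumed values, `before` holds the single offending output and `after` is empty, which matches the job succeeding once the extra type is accepted.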


hcars commented 3 years ago

Hey, Dustin, this is running fine now, and the output file that shows the average clustering coefficient by degree as a vector looks normal.

hcars commented 3 years ago

Is there supposed to be a second output that is just a float showing the average clustering coefficient for the whole graph?

dmachi commented 3 years ago

It is possible that is the case. The method is currently set up to create just two file outputs. It looks like one of those outputs might have been intended to be “job” output (the scalar), but it doesn’t write to the location that would make that happen.


dmachi commented 3 years ago

@Lucaslhm Can you check to see if this was supposed to be job output?

Lucaslhm commented 3 years ago

Based on our SNAP documentation archive:

https://github.com/NSSAC/cines-snap-texts/blob/master/snap-texts/GetClustCfAll.txt

I believe this should only return a file output containing a primitive list.

According to the code though: https://github.com/NSSAC/cines-snappy-generated/blob/snap_v2/src/GetClustCfAll/GetClustCfAll_v01.py

It's also writing snap.TFltPrV() as a variable called DegToCCfV.

It's been a while since I looked at the auto-generation code, but this doesn't ring any bells as job output for me. As I remember it, our system only emits single primitive values as job output (due to complications in generating job output automatically at the time).

I'd ping @cckuhlman to verify this, but I don't think this is supposed to be job output as written. I'm not confident there are supposed to be two file outputs, even though the code is trying to write two. I might be misremembering the process, though.
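The rule as remembered above (single primitive values become job output, everything else becomes file output) could be sketched roughly as follows. `output_kind` and `PRIMITIVES` are hypothetical names for illustration and are not taken from the auto-generation code; the sample values are made up.

```python
# Types treated as "single primitive values" in this sketch (an assumption).
PRIMITIVES = (bool, int, float, str)


def output_kind(value):
    """Classify a return value as 'job' (single primitive) or 'file' (anything else)."""
    return "job" if isinstance(value, PRIMITIVES) else "file"


# A lone scalar would qualify as job output under this rule...
scalar_kind = output_kind(0.53)

# ...while a list of values, like this method's return, would go to a file.
list_kind = output_kind([0.53, 10, 20])
```

Under this reading, a method that returns a list rather than one scalar would produce file output, which is consistent with what the generated code does.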

cckuhlman commented 3 years ago

I think (or thought) the code is good.

The return of the 3 ints in a list is written to file via the libraryMethodReturnTypeVariable variable. This should be considered file output because it is written to file.

Also, variable DegToCCfV, of type TFltPrV, is signature-type output, i.e., it is an output from the method. The format of this second file is lines of: int float
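For anyone checking that second file by hand, a small reader for the "int float" line format might look like the sketch below. It assumes the two fields are whitespace-separated and that blank lines can be skipped; `parse_deg_to_ccf` is a hypothetical helper and the sample values are made up, not taken from a real run.

```python
def parse_deg_to_ccf(text):
    """Parse lines of 'int float' into a list of (degree, coefficient) pairs."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        deg_str, ccf_str = line.split()  # raises ValueError on malformed lines
        pairs.append((int(deg_str), float(ccf_str)))
    return pairs


# Made-up sample in the described format.
sample = "1 0.0\n2 0.5\n\n3 0.25\n"
pairs = parse_deg_to_ccf(sample)
```

A line with the wrong number of fields raises `ValueError`, which is a quick way to notice that a file is not in the expected format.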

So is the problem, as I read it above, that the list of three variables should be job output? Seems cleaner to have two file outputs, since the output is more than a single scalar.

chris