Sveino / Inst4CIM-KG

Instance of CIM Knowledge Graph
Apache License 2.0

multiply instance data #117

Closed VladimirAlexiev closed 4 days ago

VladimirAlexiev commented 1 month ago

We want to multiply instance data in order to generate large datasets for performance testing.

griddigit-ci commented 1 month ago

I have a way to do this. Just put the data we want to multiply somewhere and I will multiply it.

VladimirAlexiev commented 4 weeks ago

@griddigit-ci I've described them here https://github.com/Sveino/Inst4CIM-KG/tree/develop/rdf-improved#sample-instance-data

| dataset | xml | zip | files | FullModel | triples | largest | largest file |
|---|---|---|---|---|---|---|---|
| ENTSO-E_Test_Configurations_v3.0.2 | 151M | 19M | 357 | 350 | 1844380 | 947208 | RealGrid/RealGrid-Merged/RealGrid_EQ.xml |
| Nordic44 | 2.9M | | 15 | 12 | 35481 | 17420 | CGMES_2_4/Nordic44_CGM_37a_EQ.xml |

@Sveino and @griddigit-ci

Sveino commented 3 weeks ago

Nordic44 does have a better alignment with market results. The plan is to develop the more advanced HVDC model in it. But currently there is no reason to include Nordic44 as far as SHACL performance validation is concerned. Statnett's model is about 30 MB zipped and 800 MB unzipped. I believe there is a National Grid model that can be used for real TSO model validation.

griddigit-ci commented 3 weeks ago

I did what we agreed last week. I took RealGrid and multiplied it 100 times. On the way I ran into some memory issues, but after tuning the 64 GB RAM usage on my server machine a bit, it was possible to produce the data.

The data is here: https://1drv.ms/f/s!AhDObGm0xWObjJI3y0obO3j9L4TSRw?e=4CDbxL

RealGrid10 is the 10-times multiplication. There are also 20-times, 50-times and 100-times versions. I think I should be able to go even bigger, but I'm not sure what the limit is. The EQ100 is a 1.3 GB zip.

In each of these grids, copy 1 is the same as the original. So eventually you can also merge the 4 sets and get a 177-times multiplication effect (10 + 20 + 50 + 100 = 180 copies, minus the 3 duplicated originals).
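
For readers who want to experiment with the idea, here is a minimal sketch of one way such a multiplication could work. It is not the tool used to produce the files above. It assumes CGMES 2.4-style RDF/XML with local identifiers (`rdf:ID="_x"`, `rdf:about="#_x"`, `rdf:resource="#_x"`) and double-quoted attributes; the function name `multiply_rdfxml` and the `_n` suffix scheme are made up for illustration. References to external URIs (for example enumeration values) are left alone, and so are `mRID` literals, which means node IDs and mRIDs diverge in the generated copies.

```python
import re

# Assumption: identifiers appear only in double-quoted rdf:ID / rdf:about /
# rdf:resource attributes, as in typical CGMES 2.4 RDF/XML exports.
_ATTR = re.compile(r'rdf:(ID|about|resource)="([^"]*)"')


def _rename(match: re.Match, n: int) -> str:
    """Append a copy suffix to local identifiers; keep external URIs as-is."""
    attr, value = match.group(1), match.group(2)
    is_local = attr == "ID" or value.startswith("#")
    if not is_local:
        return match.group(0)  # e.g. a reference to an enumeration value
    return f'rdf:{attr}="{value}_{n}"'


def multiply_rdfxml(body: str, copies: int) -> list[str]:
    """Return `copies` variants of an RDF/XML text.

    The first variant is the unchanged original; variant n (n >= 1) has every
    local identifier suffixed with `_n`, so the variants do not collide when
    they are loaded into the same graph.
    """
    variants = [body]
    for n in range(1, copies):
        variants.append(_ATTR.sub(lambda m, n=n: _rename(m, n), body))
    return variants
```

A text-level rewrite like this avoids parsing the whole file into a graph, which may help with the kind of memory pressure mentioned above, at the cost of relying on the stated formatting assumptions.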

VladimirAlexiev commented 3 weeks ago

Okay!

VladimirAlexiev commented 1 week ago

@griddigit-ci

griddigit-ci commented 4 days ago

Yes, you can remove the xml:base. That last line is also OK to remove. On the mRID: it could cause errors, but I'm not sure. I guess it should not cause issues for now. I will need to look at my functions. They were written for previous versions where we didn't have mRID, which is why it comes out inconsistent.

VladimirAlexiev commented 4 days ago

We think there's no rule to check that the node URI matches the mRID. Closing.