ozwillo / ozwillo-datacore

Ozwillo Datacore is a Cloud of shared Open Linked Data. Its goal is cross-business data collaboration and integration. By linking data from different businesses together, it allows creating value by developing new Ozwillo services on top of it.
http://www.ozwillo.com
GNU Affero General Public License v3.0

Keep an eye on indexes size (geo:Area_0, org:Organization_0, geoname...) #156

Open mdutoo opened 8 years ago

mdutoo commented 8 years ago

Here are the total index sizes (in bytes) of the biggest collections:

rs0:SECONDARY> db["geoname_0.geoname:GeoDB_0"].totalIndexSize()
166258960
rs0:SECONDARY> db["geo_1.geo:Area_0"].totalIndexSize()
127480192
rs0:SECONDARY> db["org_1.org:Organization_0"].totalIndexSize()
21878976
rs0:SECONDARY> db["oasis.meta.dcmi:mixin_0"].totalIndexSize()
171696

Right now, it's OK: even the geoname and geo indexes, though mostly cold, static data, are an order of magnitude smaller (120-170MB) than the portal's geo replication data.

However, it's the accumulation of different kinds of data that ends up making a big total in the Datacore. So we should keep an eye on it in the future, and especially on the Portal and Kernel indexes as well as the Datacore and Portal Java processes that it has to share RAM with, in case it becomes a performance problem.
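To make the "big total" concrete, here is a minimal sketch that sums the four figures quoted above and prints them largest first. The helper name `summarizeIndexSizes` and the MiB conversion are illustrative assumptions, not part of the Datacore code base; the byte values are copied from the shell output in this comment.

```javascript
// Index sizes in bytes, copied from the totalIndexSize() output above.
const indexSizes = {
  "geoname_0.geoname:GeoDB_0": 166258960,
  "geo_1.geo:Area_0": 127480192,
  "org_1.org:Organization_0": 21878976,
  "oasis.meta.dcmi:mixin_0": 171696,
};

// Hypothetical helper (not a Datacore or MongoDB API):
// sorts collections by index size, largest first, and computes the total.
function summarizeIndexSizes(sizes) {
  const toMiB = (bytes) => bytes / (1024 * 1024);
  const rows = Object.entries(sizes)
    .sort(([, a], [, b]) => b - a)
    .map(([name, bytes]) => ({ name, bytes, mib: toMiB(bytes) }));
  const totalBytes = rows.reduce((sum, row) => sum + row.bytes, 0);
  return { rows, totalBytes, totalMiB: toMiB(totalBytes) };
}

const summary = summarizeIndexSizes(indexSizes);
for (const row of summary.rows) {
  console.log(`${row.name}: ${row.mib.toFixed(1)} MiB`);
}
console.log(`total: ${summary.totalMiB.toFixed(1)} MiB`);
```

In the mongo shell, the same map could presumably be built by looping over `db.getCollectionNames()` and calling `totalIndexSize()` on each collection, so the check can be rerun after each new import.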

OzBruno commented 8 years ago

Don't forget that I will upload around 15 other countries to geoname, including the biggest: the United States and China.