AKSW / cubeviz.ontowiki

OntoWiki component to visualize RDF-DataCubes

CubeViz label strategy slows down workflow #178

Closed Rahien closed 11 years ago

Rahien commented 11 years ago

To fetch labels, CubeViz employs the strategy to

This approach does not scale with the number of target entities. In an example cube of 754 x 63 observations, a call to 'getobservations' took 6.3 minutes to complete. The main problem seems to be in

cubeviz/classes/DataCube/Query.php: $entry ['__cv_niceLabel'] = $titleHelper->getTitle($mainKey);
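To put the reported numbers in perspective, here is a rough Python sketch (not CubeViz or OntoWiki code). It assumes one label lookup per observation, which the line above suggests but the thread does not confirm. `CountingStore`, `label_one_by_one` and `label_batched` are hypothetical names used only to contrast per-entry lookups with a single batched lookup.

```python
# Numbers reported above.
observations = 754 * 63              # 47502 entries in the example cube
total_seconds = 6.3 * 60             # reported duration of the call
ms_per_observation = total_seconds / observations * 1000


class CountingStore:
    """Hypothetical stand-in for a label store; counts round trips."""

    def __init__(self, labels):
        self._labels = labels
        self.queries = 0

    def get_label(self, uri):
        # One round trip per URI.
        self.queries += 1
        return self._labels.get(uri, uri)   # fall back to the URI itself

    def get_labels(self, uris):
        # One round trip for the whole list.
        self.queries += 1
        return {uri: self._labels.get(uri, uri) for uri in uris}


def label_one_by_one(store, uris):
    """Per-entry strategy: as many lookups as there are URIs."""
    return {uri: store.get_label(uri) for uri in uris}


def label_batched(store, uris):
    """Batched strategy: a single lookup for all URIs."""
    return store.get_labels(uris)


if __name__ == "__main__":
    uris = [f"http://example.org/obs/{i}" for i in range(observations)]
    slow, fast = CountingStore({}), CountingStore({})
    label_one_by_one(slow, uris)
    label_batched(fast, uris)
    print(observations, round(ms_per_observation, 2))
    print(slow.queries, fast.queries)
```

Under that assumption the 6.3 minutes work out to roughly 8 ms per observation, so each individual lookup is not slow; the cost comes from doing tens of thousands of them.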

Keep up the good work!

k00ni commented 11 years ago

@Rahien: Can you please list the specs of your local machine / server and the memory_limit-setting of your PHP installation?

MichaelMartin commented 11 years ago

I think this is not really a problem of CubeViz but rather of OntoWiki's TitleHelper. I will create an issue there and try to handle the bug fixes on the OntoWiki side.

On the CubeViz side, a further improvement (in addition to limiting the number of handled titles in the config) could be to make titles for observations configurable, so that they can be disabled (they are only used for the legend). This means we would only fetch titles for structural items such as dimension elements.
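A minimal Python sketch of that idea, not actual CubeViz code: `title_targets` and the `titles_for_observations` switch are hypothetical names for a config option that decides whether observation URIs are included in the set of titles to fetch.

```python
def title_targets(dimension_elements, observations, titles_for_observations=False):
    """Return the URIs whose titles should be fetched.

    Structural items (here: dimension elements) are always included.
    Observations are included only if the hypothetical config switch is on.
    """
    targets = list(dimension_elements)
    if titles_for_observations:
        targets.extend(observations)
    return targets


if __name__ == "__main__":
    dims = [f"http://example.org/dim/{i}" for i in range(63)]
    obs = [f"http://example.org/obs/{i}" for i in range(754 * 63)]
    print(len(title_targets(dims, obs)))                                # titles disabled
    print(len(title_targets(dims, obs, titles_for_observations=True)))  # titles enabled
```

With the switch off, the number of titles to resolve depends only on the structural items, not on the size of the cube.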

k00ni commented 11 years ago

Link to the discussion in OntoWiki about improvements in TitleHelper: https://github.com/AKSW/OntoWiki/issues/254

Rahien commented 11 years ago

It runs on a virtual machine with Ubuntu 12.04 LTS, which I gave 4 GB of memory and 4 processor cores.

cat /proc/cpuinfo, 4x
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 58
model name : Intel(R) Core(TM) i7-3720QM CPU @ 2.60GHz
stepping : 9
microcode : 0x15
cpu MHz : 2591.709
cache size : 6144 KB
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc aperfmperf pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt aes xsave avx f16c rdrand hypervisor lahf_lm ida arat epb xsaveopt pln pts dtherm fsgsbase smep
bogomips : 5183.41
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:

cat /proc/meminfo
MemTotal: 4042456 kB
MemFree: 1086884 kB (running a virtuoso)

The PHP memory_limit is currently set to 128M; I guess it could use a bit more memory...

k00ni commented 11 years ago

I would suggest not fetching labels for observations. Usually an observation has no label, so that should be fine in most cases. If there is one, CubeViz 1.0 makes it possible to see the label in the extended legend.

For the other parts of CubeViz: we make extensive use of the ObjectCache and QueryCache to store results and reduce the number of database queries. To fix this issue, I will remove TitleHelper from getObservations.
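As a rough illustration of why caching alone is unlikely to help here, the following Python sketch uses a plain dictionary cache (`CachedLabels` is a hypothetical name, not the OntoWiki cache API). It assumes every observation has its own URI, so each lookup is a cache miss, whereas repeated dimension elements are served from the cache.

```python
class CachedLabels:
    """Hypothetical label cache: fetches each distinct URI at most once."""

    def __init__(self, fetch):
        self._fetch = fetch      # function that performs the expensive lookup
        self._cache = {}
        self.misses = 0

    def get(self, uri):
        if uri not in self._cache:
            self.misses += 1
            self._cache[uri] = self._fetch(uri)
        return self._cache[uri]


if __name__ == "__main__":
    fetch = lambda uri: uri.rsplit("/", 1)[-1]   # stand-in for a real lookup

    # Dimension elements repeat across observations, so the cache pays off.
    dims = CachedLabels(fetch)
    for i in range(1000):
        dims.get(f"http://example.org/dim/{i % 10}")
    print(dims.misses)

    # Observation URIs are assumed to be unique, so every lookup is a miss.
    obs = CachedLabels(fetch)
    for i in range(1000):
        obs.get(f"http://example.org/obs/{i}")
    print(obs.misses)
```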

Any remarks?