Open hit-lacus opened 2 years ago
MDX for Kylin is a separate process that depends on Kylin for metadata. It syncs Cube metadata by calling Kylin's REST API, so earlier versions of Kylin need the following patched jars applied for the sync process to work.
Kylin 4.0.2 and later should support MDX for Kylin out of the box (4.0.2 is about to be released in June 2022).
For earlier versions, you have to replace one jar so that Kylin can respond to the specific HTTP API used for the metadata sync. If the version you use is not listed here, please open a new issue.
Kylin Version | Patched jar link |
---|---|
Kylin 4.0.1 | kylin-server-base-4.0.1.jar |
Kylin 4.0.0 | kylin-server-base-4.0.0.jar |
Kylin 3.1.3 | kylin-server-base-3.1.3.jar |
Kylin 3.1.1 | kylin-server-base-3.1.1.jar |
Kylin 2.5.2 | kylin-server-base-2.5.2.jar |
Note: You must start Kylin first, so that kylin.war in ${KYLIN_HOME}/tomcat/webapps is decompressed.
Then replace the jar in ${KYLIN_HOME}/tomcat/webapps/kylin/WEB-INF/lib with the patched one, using the following commands:
# Start Kylin first so that kylin.war is decompressed
${KYLIN_HOME}/bin/kylin.sh start
# Then go to the lib directory
cd ${KYLIN_HOME}/tomcat/webapps/kylin/WEB-INF/lib
# Back up the original kylin-server-base-x.x.x.jar
# (moved aside, not copied, so that wget does not save the download as *.jar.1)
mv kylin-server-base-x.x.x.jar kylin-server-base-x.x.x.jar.bak
# Download the patched jar for MDX
## use the link for your version from the table above
wget ${link of kylin-server-base-x.x.x.jar}
# Restart Kylin
${KYLIN_HOME}/bin/kylin.sh restart
After that, enjoy MDX for Kylin ~
Kylin Version | Package link |
---|---|
Kylin 4.0.2-SNAPSHOT (Spark3) | https://s3.cn-north-1.amazonaws.com.cn/public.kyligence.io/kylin/tar/apache-kylin-4.0.2-SNAPSHOT-bin-spark3.tar.gz |
Kylin 4.0.2-SNAPSHOT (Spark2) | https://s3.cn-north-1.amazonaws.com.cn/public.kyligence.io/kylin/tar/apache-kylin-4.0.2-SNAPSHOT-bin-spark2.tar.gz |
Hi, I have to use Java 11 (this version AdoptOpenJDK 11.0.11+9 ) and I am seeing this on mdx.sh start:
Unrecognized VM option 'PrintGCDateStamps'
I think I need to use this: -XX:+IgnoreUnrecognizedVMOptions
Where can I place this please? And/or any other advice?
Thanks
Leigh Tilley tilleytech.com
Hi @LeighTilley ,
You can check the file startup.sh in the source code, at
semantic-mdx/semantic-deploy/scripts/startup.sh. If you have already built the package, you can set the JVM parameters in $MDX_HOME/semantic-mdx/startup.sh instead.
Append any JVM parameters to this line:
JAVA_OPTS="$jvm_xms $jvm_xmx -XX:+UseG1GC -XX:G1HeapRegionSize=4m -XX:MaxMetaspaceSize=512m -Dfile.encoding=UTF-8"
in startup.sh.
Please feel free to contact us.
Best regards.
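For the Java 11 case raised above, here is a minimal sketch of what the edited line in startup.sh might look like. The heap values assigned to jvm_xms and jvm_xmx are placeholders (assumptions); the real script sets them itself. -XX:+IgnoreUnrecognizedVMOptions is a standard HotSpot flag that tells the JVM to skip -XX options it does not know, such as PrintGCDateStamps on JDK 9 and later.

```shell
# Placeholder heap settings (assumption): startup.sh defines these itself.
jvm_xms="-Xms1g"
jvm_xmx="-Xmx4g"

# The JAVA_OPTS line quoted above, with the extra flag appended at the end.
JAVA_OPTS="$jvm_xms $jvm_xmx -XX:+UseG1GC -XX:G1HeapRegionSize=4m -XX:MaxMetaspaceSize=512m -Dfile.encoding=UTF-8 -XX:+IgnoreUnrecognizedVMOptions"

# Print the final option string to check it before restarting.
echo "$JAVA_OPTS"
```

If the JVM still refuses to start, the remaining GC-logging options in the script may need to be replaced by their -Xlog:gc equivalents.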
Hi
OK thanks for the prompt response. I added it and will test later as first I realised I need to get Kylin itself running!
Basically on Friday I learnt that my client may stop using ActivePivot, which is the serverside OLAP tech I use. On client side I have custom C# MDX maker I wrote.
So I looked around for an alternative to use, or evaluate, and Kylin looks to fit the bill!
Did you ever see this message when trying to run Kylin?
'Please make sure the user has the privilege to run hive shell'
Thanks
Leigh Tilley tilleytech.com
Hi @LeighTilley ,
Kylin needs to read data in Hive, so the user needs that privilege. Kylin also checks privileges on other big data components when you run ./bin/check-env.sh, which checks whether the environment is OK for Kylin.
You can get Kylin-related docs at https://kylin.apache.org/docs/.
mdx-kylin also provides a Docker image for a quick start. You can access https://hub.docker.com/repository/docker/apachekylin/apache-kylin-standalone or run the following command:
docker run -d \
-m 8G \
-p 7070:7070 \
-p 7080:7080 \
-p 8088:8088 \
-p 50070:50070 \
-p 8032:8032 \
-p 8042:8042 \
-p 2181:2181 \
apachekylin/apache-kylin-standalone:kylin-4.0.1-mondrian
If you are familiar with AWS, I have built a tool to quickly run Kylin 4 on AWS. The project is at https://github.com/apache/kylin/tree/kylin4_on_cloud. It helps you start Kylin 4 on AWS quickly, and it can also be configured to install and start MDX for Kylin.
Best regards, Mukvin
Hey
Thanks so much for your prompt response.
I will check out the details.
I cannot use AWS as I am working for a client (investment bank) and it must be inside the bank and I am setting it all up on an RHEL server. :)
Leigh
Hey
OK yes I am making progress now. I just set the HADOOP_CONF_DIR but getting a fail on mkdir (yet I am the owner). I think I am almost there! :)
bash-4.2$ ./check-env.sh
Retrieving hadoop conf dir...
...................................................[PASS]
KYLIN_HOME is set to /data/apadmin/kylin/apache-kylin-4.0.1-bin-spark3
Checking hive
...................................................[PASS]
Checking hadoop shell
...................................................[PASS]
Checking hdfs working dir
WARNING: log4j.properties is not found. HADOOP_CONF_DIR may be incomplete.
mkdir: `/kylin': Input/output error
...................................................[FAIL]
Failed to create /kylin. Please make sure the user has right to access /kylin
bash-4.2$
I shall keep tweaking! :)
Leigh Tilley tilleytech.com
Hi @LeighTilley, You can check this doc: https://kylin.apache.org/docs/gettingstarted/kylin-quickstart.html, and make sure that /kylin exists in HDFS (it is the default working dir for Kylin). If it does not, you can create it manually and give the user privileges on the /kylin dir.
Best Regards. Mukvin
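The manual step described above could look roughly like the sketch below. It uses the standard `hdfs dfs -mkdir -p` and `hdfs dfs -chown -R` commands. The `run` helper, the DRY_RUN switch and the choice of the current user as owner are assumptions made for illustration; by default the commands are only printed, not executed.

```shell
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 on a host where the hdfs client is configured.
DRY_RUN="${DRY_RUN:-1}"
WORKING_DIR="/kylin"                     # default Kylin working dir in HDFS
KYLIN_USER="${KYLIN_USER:-$(id -un)}"    # assumption: Kylin runs as the current user

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create the working dir (no error if it already exists), then hand it to the Kylin user.
run hdfs dfs -mkdir -p "$WORKING_DIR"
run hdfs dfs -chown -R "$KYLIN_USER" "$WORKING_DIR"
```

Changing ownership in HDFS normally has to be done by the HDFS superuser. An Input/output error like the one in the check-env.sh output can also mean that HDFS itself is not reachable, in which case these commands fail the same way.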
Hey
Ah thanks so much for your help. I will check and try this.
I just heard that it is really official that ActivePivot will not be allowed after October. So I will be migrating the ActivePivot project, server-side stuff, to Kylin.
Do you mind to answer these questions?
Can I store vector / double arrays? I have these in my project at present.
How do I write custom measures? E.g. in my current project I can write something called a post-processor (Java class) for a custom measure.
E.g. daysTilMukvinsBirthday (exposed as a measure on the UI)
and in the Java class I obtain today's date and Mukvin's birthday and calculate the difference.
Is it possible to have this sort of thing?
Thanks a lot for your advice.
Leigh Tilley tilleytech.com
Hi, @LeighTilley , here are my answers:
"Can I store vector / double arrays?" No. Kylin doesn't support such data types. But there may be some workarounds, I guess.
"How do I write custom measures?" Yes. You can create custom measures by writing MDX expressions. Please check my article at https://medium.com/kyligence/how-to-use-excel-to-query-big-data-mdx-for-kylin-part-i-acba473c7a83 .
"In the Java class I obtain today's date and Mukvin's birthday and calculate the difference?" Yes. MDX for Kylin supports date intelligence functions such as YTD; they can be found at https://kyligence.github.io/mdx-kylin/en/dataset/mdx_list.en.html.
"Is it possible to have this sort of thing?" Yes. I think so.
Besides, I created a new Slack workspace, and I would like to have an online meeting. You can join us by clicking https://join.slack.com/t/apache-rz23366/shared_invite/zt-1825cd9pe-wwz7bsIhtW~zdZWG~U82jg .
Hi @LeighTilley, here are my answers:
I am not clear about the actual scenario here. For more about Kylin, please check the Kylin docs at https://kylin.apache.org/
Yes, MDX for Kylin helps you define custom measures easily; for more details see https://kyligence.github.io/mdx-kylin/en/dataset/design_dataset/s3_3_measure.en.html. You don't need to write a post-processor for a custom measure. I am sure that https://kyligence.github.io/mdx-kylin will make MDX for Kylin clear to you.
Yes, MDX for Kylin can sort dimensions and measures. For more details see https://kyligence.github.io/mdx-kylin/en/integration/excel_function_list.en.html#sort-function. After querying in Excel (with MDX for Kylin already set up as the data source), you can sort the results directly in Excel.
Everything will be easy for users querying from Excel or other BI tools with MDX for Kylin.
Hey
many thanks for your replies.
For Slack - yes I can speak to you. Monday, Thursday and Friday I am on client site (investment bank) however Tuesday and Wednesday I am at home working remotely so I can speak then. As i am not allowed to use Slack / externally reaching apps at the bank. I can speak to you from personal devices.
On my project, which has a big codebase, it is written in Java and using the ActivePivot codebase as the cube.
This is what i currently use on serverside: https://activeviam.com/activepivot/latest/docs/intro/overview/
It used to be a WAR file up until last year and now is a Java 11 SpringBoot launchable JAR at version 5.11.1. (we've been using it since v3.5 2010!!! So this is quite the situation! :) ).
I work on it in Eclipse and build it using Maven, usual sort of Java project.
we provide a daily set of risk reports to traders and the older/existing ones are constructed via Excel being launched via script, tons of VBA code writing and querying the cube, creating tons of tabs with various pivot tables and reports and then being sent to traders.
My new setup is that I have written custom C# XLL (Excel addin) with ExcelDNA that uses a DLL I also wrote in C# to take simple requests from user (as it's a function appearing in Excel) for dimensions, measures, filters etc and my C# constructs MDX on-the-fly to query the remote cube. :) (I have my own functions for CubeMDX, CubeTopCount, CubeBottomCount, DrillThroughQuery etc) which construct the appropriate MDX. It was a good way for me to learn MDX by having to make it from C# haha :).
I was asked to create this as we will use this embedded to make brand new dashboards/reports so we have no VBA anywhere! :)
As we have lots of financial data, some of the measures are standard (SUM, MAX etc) whereas others need to be custom.
For example, for every bond trade, the mark to market price is obtained from the bondStatic (referential/static data contained in a map/store and keyed by ISIN) so for every leaf level we take the ISIN dimension and use it to look up the mark to market price and return that (which is then aggregated by the aggregation engine).
We have various situations like this. The current cube holds 2 days worth of data, so we do things like a DIFF where the Java class code will retrieve day T2 and then do the simple calculation against T1.
For the double array question:
[1.2,2.3,3.2,4.5] | in the java class, we obtain CS01.array and return the sum (simple for loop etc)
ActivePivot can hold the CS01.array natively.
HI @LeighTilley,
Thanks for your detailed comment. From what I read, you may well run MDX for Kylin and step into a new world of MDX, haha~ Please feel free to join the Slack, so we can communicate with each other directly.
Best Regards. Mukvin
Hi @LeighTilley,
Could you provide some example data and a SQL query, so I can get a better picture of this scenario?
Best Regards. Mukvin
Hi
Well inside ActivePivot, the data csv is multi-threaded loaded and parsed in custom Java code, and then inserted into the cube.
So at schema level, CS01 (Credit Sensitivity) is stored as an array of double (double[]) at column / dimension level in one of the internal datastores (as part of the snowflake schema).
Then there is a custom measure, CS01.Total, which obtains the core field from above, performs a for loop and returns the result.
We have many like this and then also more complicated calculations too which retrieve several core fields and/or other measures at different hierarchy levels.
So it is important to be able to do this kind of thing.
To query this cube, it is pure MDX. There is no SQL.
Hi @LeighTilley,
As you said, I wonder if you can create a table such as the following.
table name: cs01
id | val |
---|---|
1 | 0.01 |
2 | 0.02 |
3 | 0.03 |
x | ... |
(Yes, it just converts the double[] array into a column of a table.)
Then you can use Kylin to build a cube with the measure sum(cs01.val), which is the same as CS01.Total.
If you run this SQL in Kylin: select sum(val) from cs01
you get the same result as you did before.
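To make the array-to-rows idea concrete, here is a small local sketch using the sample array quoted earlier in the thread ([1.2,2.3,3.2,4.5]). It only uses tr and awk as a stand-in; in a real setup the rows would be loaded into the cs01 table and Kylin would evaluate sum(val).

```shell
# The sample array from earlier in the thread, as one comma-separated value.
array="1.2,2.3,3.2,4.5"

# One row per element in "id|val" form, matching the cs01 table layout.
rows=$(printf '%s\n' "$array" | tr ',' '\n' | awk '{ printf "%d|%s\n", NR, $1 }')
echo "$rows"

# Local stand-in for: select sum(val) from cs01
total=$(printf '%s\n' "$rows" | awk -F'|' '{ s += $2 } END { print s }')
echo "sum(val) = $total"
```

The for loop over the array in the Java class and the SUM over the flattened rows compute the same number; only the storage layout changes.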
Hi
I see what you mean. The data in the csv actually comes in as rows like above anyway. And it is transposed in Java code and placed in an array and the sensitivity datastore is mapped to the trade datastore via date and instrument value (so any deal with the same instrument value has same sensi etc).
I am continuing my setup.
After you mentioned HDFS I realised I was doing things in the wrong order. :)
I had not got Hadoop set and running.
I am following this guide: https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/SingleCluster.html
I am at the sbin/start-dfs.sh part
I see problem with key though:
localhost: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Starting secondary namenodes [cmxd2x01]
cmxd2x01: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
bash-4.2$
Did u experience this?
Hi @LeighTilley,
I suggest you run the Docker image. It helps you get to know Kylin and MDX for Kylin quickly, because the container starts a small Hadoop cluster for Kylin and MDX for Kylin.
https://github.com/Kyligence/mdx-kylin/issues/1#issuecomment-1111652211
Best regards. Mukvin
OK
Yes I saw the info about Docker, from you and also on Kylin website, but noticed it is jdk8 etc and i wanted to go through how I'd set it up for real. As my main concern is how i get my data in (presumably the actual data is stored in hadoop).
I just signed up to docker, and i know of docker but not done any yet. i am behind a firewall in an investment bank too so will see how i can actually pull the image in.
Hi
As my client is an investment bank I must wait whilst they install various things for me...
I got Docker on my RHEL server today although i cannot yet pull the image.
I've also requested for them to check the Hadoop/HDFS setup I began and also requested MySQL setup.
In the meantime, i realised I had not yet asked if I will be able to use my custom C# MDX tool. It expects to talk to / send MDX to an XMLA or other MDX endpoint.
Does MDX for Kylin have such an endpoint to send MDX statements to?
Thanks
Leigh
Hi @LeighTilley,
You can send the request body as in the following example:
More details: https://kylin.apache.org/blog/2022/03/31/how-to-use-excel-to-query-kylin/#Call%20API%20to%20query%20MDX%20for%20Kylin
Hey
Lovely stuff. Thanks for the prompt response. ;)
@LeighTilley , MDX for Kylin provides a Java API via olap4j. Please check the demo at https://github.com/apache/kylin/blob/mdx-query-demo/src/main/java/io/kyligence/mdxquerydemo/MdxQueryDemoApplication.java and https://kylin.apache.org/cn_blog/2022/04/20/kylin4-on-cloud-part2/ .
@hit-lacus
Great thanks alot. That could also be useful too as I also read/write/work with Java on my RHEL server.
My client-side C# MDX creator i wrote uses:
using Microsoft.AnalysisServices.AdomdClient;
Which packs up the connection and XML envelope etc.
Hi
I finally got Docker installed by internal Docker and Unix teams.
I can do this and it seems to launch with no error:
docker run -d \
-m 8G \
-p 7071:7071 \
-p 8088:8088 \
-p 50070:50070 \
-p 8032:8032 \
-p 8042:8042 \
-p 2181:2181 \
apachekylin/apache-kylin-standalone:4.0.0
and according to the guide I should be able to do this:
http://server:7071/kylin/login
But I do not. I've asked Unix to check that it is really up and that the port ranges are allowed (most probably blocked at mo as everything usually is at an investment bank until requested to be opened! :)
I just noticed this docker image in the guide too,
docker pull apachekylin/apache-kylin-standalone:kylin-4.0.1-mondrian
So I will try to get this.
Thanks
leigh
Hey
I am awaiting permissions on the docker container log area, as I suspect I need to check the container logs.
I did just retrieve the mondrian one, with mdx for kylin added, and i ran it. I still do not see anything if I try web UI though. I will come back here once I've seen the container logs. :)
docker run -d \
-m 8G \
-p 9090:9090 \
-p 9091:9091 \
-p 8088:8088 \
-p 50070:50070 \
-p 8032:8032 \
-p 8042:8042 \
-p 2181:2181 \
apachekylin/apache-kylin-standalone:kylin-4.0.1-mondrian
Dear @LeighTilley, here is my reply:
apachekylin/apache-kylin-standalone:kylin-4.0.1-mondrian is the superset, while apachekylin/apache-kylin-standalone:4.0.0 is the subset. So it contains everything you need, including Kylin and MDX for Kylin, and you don't need the image apachekylin/apache-kylin-standalone:4.0.0. The [docker doc](https://kylin.apache.org/docs/install/kylin_docker.html) says: "The extra following service will start based on services of Quickly try Kylin started:", maybe it is not clear enough.
Changing the port mapping to -p 7071:7071 is not enough, because you also need to modify the corresponding configuration entry in kylin.properties.
Hi
Ah yes of course, config changes. This is my first time using Docker, and as you've told me to try the image I managed to download both the standard image and the image with the MDX for Kylin built in.
It is taking me some time to get the prerequisites installed for launching Kylin myself. I have Hadoop ready, but I am still working to get HDFS running properly, then I have to move on to Hive, Spark etc. I did request MySQL and I am still waiting... :). I have not used any of these before so I am learning about where each config and setting is for each of them right now; lots to read :)
So in the meantime the Docker image seems to be the best approach to evaluate Kylin.
An image is self contained though; how can I change it? I can't right? I changed the port purely as I kept getting an error that 7070 was taken,
I will try it again.
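On the question of changing ports without touching the image: `docker run -p` takes HOST_PORT:CONTAINER_PORT, so a busy host port can be avoided by changing only the left-hand side, which leaves kylin.properties inside the container alone. The sketch below assumes Kylin listens on 7070 inside the container and that 7071 is free on the host; it only builds and prints the command.

```shell
HOST_PORT="${HOST_PORT:-7071}"   # free port on the host (assumption)
KYLIN_PORT=7070                  # port Kylin listens on inside the container (assumption)

# -p takes HOST:CONTAINER, so only the host side of the Kylin mapping changes.
cmd="docker run -d -m 8G -p ${HOST_PORT}:${KYLIN_PORT} -p 7080:7080 -p 8088:8088 -p 50070:50070 -p 8032:8032 -p 8042:8042 -p 2181:2181 apachekylin/apache-kylin-standalone:kylin-4.0.1-mondrian"

# Inspect the command first; run it with: eval "$cmd"
echo "$cmd"
```

With this mapping the Kylin web UI would be reached on the host at port 7071, while services inside the container keep talking to Kylin on 7070.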
Hey
I went back to the default usage and actually it is unclear which port is taken:
sudo docker run -d \
-m 8G \
-p 7070:7070 \
-p 7080:7080 \
-p 8088:8088 \
-p 50070:50070 \
-p 8032:8032 \
-p 8042:8042 \
-p 2181:2181 \
apachekylin/apache-kylin-standalone:kylin-4.0.1-mondrian
2b537f1d022654b69511462428fbbc319d5a796b9b3fd8018340b555fd6c237d
docker: Error response from daemon: driver failed programming external connectivity on endpoint nifty_mestorf (70c4e58efe716451fba1faba42318eb5e8e9881612afe0e5df532b0084616403): Error starting userland proxy: listen tcp 0.0.0.0:7070: bind: address already in use.
OK
I remembered netstat and grep'd on 7070 and did find an older Kylin launched by docker that had not shown up in docker container ls commands.
Anyway, I force killed it and I've now run a fresh version:
bash-4.2$ sudo docker run -d -m 8G -p 7070:7070 -p 7080:7080 -p 8088:8088 -p 50070:50070 -p 8032:8032 -p 8042:8042 -p 2181:2181 apachekylin/apache-kylin-standalone:kylin-4.0.1-mondrian
00879d1cbd40ee8baaa90e7d823a916fea206e3aaffcf88660e7be9e64e67490
bash-4.2$
No errors, so I assume it is running. No content in the browser though. I'll chase infrastructure on permissions for the container logs folder to see if I can find any info in the logfile for this container above.
Mmmmm
I just worked out how to 'get inside' the container, so I looked around and now know how to navigate. :)
I found the Kylin main log:
INFO: Initializing Spring FrameworkServlet 'kylin'
May 10, 2022 9:13:55 PM org.apache.catalina.startup.HostConfig deployWAR
INFO: Deployment of web application archive [/home/admin/apache-kylin-4.0.1-bin-spark2/tomcat/webapps/kylin.war] has finished in [27,077] ms
May 10, 2022 9:13:55 PM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler ["http-bio-7070"]
May 10, 2022 9:13:55 PM org.apache.catalina.startup.Catalina start
INFO: Server startup in 27132 ms
Yes all looks present and correct:
[root@00879d1cbd40 apache-kylin-4.0.1-bin-spark2]# cd tomcat
[root@00879d1cbd40 tomcat]# cd webapps/
[root@00879d1cbd40 webapps]# ls
kylin kylin.war
[root@00879d1cbd40 webapps]# cd kylin
[root@00879d1cbd40 kylin]# ls
META-INF WEB-INF css fonts image images index.html js manifest.appcache routes.json worker-json.js
[root@00879d1cbd40 kylin]#
I will keep looking as to me it looks as if I should be able to see the Kylin and MDX for Kylin Web UIs...
Hey hey :)
OK so where I was busy debugging it I got to read up on the docker configs etc and confirmed the ports were being exposed on host network.
I managed to connect to the web UIs as I'd realised I'd not used the fully qualified internal server name over in Paris (I'm in London). :)
Before I go over to the MDX for Kylin documentation, perhaps u can help me with the initial login (I can see both Kylin UI and MDX for Kylin UI) ? :)
I tried admin/admin and admin/kadmin
Thanks
Leigh
Hey Oh, don't worry I'm in :)
Hey
In the MDX for Kylin webUI I see:
Dataset Role Diagnosis Configuration
Looking at your tutorial, is there meant to be a Dataset menu item too? As I do not see anything that appears to 'see' the project 'learn_kylin' over on the Kylin webUI.
Thanks
Leigh
Hey
I think the above, in MDX for Kylin, not showing much, is because my cubes are disabled in Kylin.
I found 'enable' but this gives error:
Oops..
Failed to enable cube: kylin_sales_cube Caused by: Cube 'kylin_sales_cube' doesn't contain any READY segment.
Hey
Oh I've found Purge, Rebuild etc...so I am waiting for it to rebuild....
Hi @LeighTilley , you need to build the cube first. For the Kylin side of things, please refer to https://kylin.apache.org/docs/gettingstarted/kylin-quickstart.html. How to build the sample cube is shown in the following image:
Hi
Yes, Build gave the error also, so I've since done 'Purge -> Rebuild...' and it's now green / ready ;)
Thx for reply ;)
Hi
I'm in MDX for Kylin UI. As I mentioned before, I do not see much I can do there...I created new user.
I found 'Sync task Status' in 'Configuration' so I ran that, thinking perhaps it needs a sync after the cube (re) build. It's still running though.
I'll wait til it finishes. :)
Hi
I am looking for this, but I do not see it in MDX for Kylin:
As my initial target is to check out MDX for Kylin and then to point my C# DLL / XLL for Excel Function at it and query the sample.
Hey
Is OK my browser was somehow only on the user and config area. There were no buttons back to the main 'Overview' area.
I manually changed the URL and now I see everything.
Hey
OK I have a 'Test' dataset.
As a quick test I tried to reach the xmla endpoint in Chrome (it asked for a password and I used ADMIN / KYLIN):
URL used: http://my remote RHEL server:7080/mdx/xmla/learn_kylin
Error shown:
This application has no explicit mapping for /error, so you are seeing this as a fallback.
Wed May 11 22:37:11 GMT+08:00 2022 There was an unexpected error (type=Method Not Allowed, status=405).
How do I properly call the xmla please?
Thx
Leigh
Hey
Is OK I've worked it out and I now have a standard pivot in Excel and also my C# DLL retrieving Measures.. :) :) :)
Hi @LeighTilley,
As mentioned before, please refer to https://github.com/Kyligence/mdx-kylin/issues/1#issuecomment-1118558905, which uses Postman to access the XMLA endpoint.
Best Regards. Mukvin
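For readers without Postman, the same kind of call can be sketched with curl. A browser sends a GET, which is what produced the 405 above; XMLA requests are sent as an HTTP POST with a SOAP envelope as the body. The host name below is a placeholder (assumption), the port, path and project are the ones used in this thread, and the body is a generic XMLA Discover request rather than anything specific to MDX for Kylin. Basic authentication with ADMIN / KYLIN is assumed from the browser prompt described above.

```shell
# Placeholder host (assumption); port, path and project come from this thread.
XMLA_URL="${XMLA_URL:-http://mdx-host.example:7080/mdx/xmla/learn_kylin}"
BODY_FILE="${BODY_FILE:-$(mktemp)}"

# Generic XMLA Discover request wrapped in a SOAP envelope.
cat > "$BODY_FILE" <<'EOF'
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <Discover xmlns="urn:schemas-microsoft-com:xml-analysis">
      <RequestType>DISCOVER_DATASOURCES</RequestType>
      <Restrictions><RestrictionList/></Restrictions>
      <Properties><PropertyList/></Properties>
    </Discover>
  </soap:Body>
</soap:Envelope>
EOF

# Hypothetical helper: POST the body with HTTP basic auth.
xmla_post() {
  curl -sS -u "${MDX_USER:-ADMIN}:${MDX_PASS:-KYLIN}" \
       -H 'Content-Type: text/xml' \
       --data-binary "@$BODY_FILE" \
       "$XMLA_URL"
}

# Call xmla_post once XMLA_URL points at a real server.
```

Client libraries such as ADOMD.NET or Excel build and send this kind of envelope themselves, so it is mainly useful for checking that the endpoint is reachable.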
Hi @LeighTilley , congratulations! It looks like you have tried MDX for Kylin, and the pivot table in Excel connects to it well. So, what is your next question?
Hi
Thanks for helpful tips.
I will now keep reading and checking the demo project and analyse how I would create our schema and data in Kylin.
I will also continue to setup pre-requisites for Kylin and MDX for Kylin on my RHEL server so that I can create a proper production setup as needed.
Leigh
Download MDX for Kylin
Package of MDX for Kylin