HiromuHota / pentaho-kettle

webSpoon is a web-based graphical designer for Pentaho Data Integration with the same look & feel as Spoon
https://hub.docker.com/r/hiromuhota/webspoon/
Apache License 2.0

User configuration management #27

Open HiromuHota opened 7 years ago

HiromuHota commented 7 years ago

This issue is the place to discuss the Configuration-Management page in the wiki. Please feel free to comment here.

HiromuHota commented 7 years ago

Configuration management of Hadoop clusters should also be considered. Currently, only one Hadoop cluster can be active at a time within a webSpoon instance. See here for more details.

HiromuHota commented 7 years ago

As of e1067568f2fc991c5379c0b884844584b93fed43, the webspoon-7.1_multiuser branch (not merged yet) changes some of the configuration files from shared (by all users) to dedicated (to each user).

| Item | Description | As of e1067568f2fc991c5379c0b884844584b93fed43 | Desired | Comments |
| --- | --- | --- | --- | --- |
| kettle.properties | Main PDI properties file; contains global variables for low-level PDI settings | Shared | Shared | This affects all users; hence, should be managed only by admin. |
| shared.xml | Shared objects file | Dedicated | <-- | |
| db.cache | The database cache for metadata | Dedicated | <-- | |
| repositories.xml | Connection details for PDI database or solution repositories | Shared | Dedicated | It takes more engineering effort to change. |
| .spoonrc | User interface settings, including the last opened transformation/job | Dedicated | <-- | |
| .languageChoice | Default language for the PDI client tool | Shared | Dedicated | It takes more engineering effort to change, but one global setting may be just fine. |
| xulSettings.properties | Not documented | Dedicated | <-- | |

HiromuHota commented 7 years ago

The directory structure of the configuration files is shown below. In addition to the configuration files, a data directory is created for each user to store Kettle files locally.

```
$HOME/.kettle/
├── .languageChoice
├── kettle.properties
├── repositories.xml
└── users
    ├── user1
    │   ├── .spoonrc
    │   ├── data
    │   │   └── Untitled.ktr
    │   ├── db.cache-7.1.0.0-12
    │   ├── shared.xml
    │   ├── shared.xml.backup
    │   └── xulSettings.properties
    └── user2
        ├── .spoonrc
        └── data
```
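
To make the layout above concrete, here is a minimal sketch of how a path could be resolved for a shared versus a dedicated file. It is not webSpoon's actual code (webSpoon is written in Java); the function names `config_path` and `data_dir` are hypothetical, and the shared/dedicated split is copied from the table as of e1067568f2fc991c5379c0b884844584b93fed43.

```python
from pathlib import Path

# Scope of each file as of e1067568, taken from the table in this thread.
SHARED = {"kettle.properties", "repositories.xml", ".languageChoice"}
# Note: on disk, db.cache carries a version suffix (e.g. db.cache-7.1.0.0-12);
# this sketch uses the plain name from the table.
DEDICATED = {"shared.xml", "db.cache", ".spoonrc", "xulSettings.properties"}


def config_path(kettle_home: Path, user: str, name: str) -> Path:
    """Return where `name` would live for `user` (hypothetical helper)."""
    if name in SHARED:
        # Shared files sit directly under $HOME/.kettle/.
        return kettle_home / name
    if name in DEDICATED:
        # Dedicated files sit under $HOME/.kettle/users/<user>/.
        return kettle_home / "users" / user / name
    raise KeyError(f"unknown configuration file: {name}")


def data_dir(kettle_home: Path, user: str) -> Path:
    """Return the per-user directory for locally stored Kettle files."""
    return kettle_home / "users" / user / "data"
```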
HiromuHota commented 7 years ago

The $HOME/.pentaho directory should also be considered, especially $HOME/.pentaho/metastore/pentaho, which holds some configuration.

```
$HOME/.pentaho/metastore/pentaho/
├── Data\ Service\ Transformation
│   └── hoge.xml
├── Default\ Run\ Configuration
│   └── Carte.xml
├── Kettle\ Data\ Set
│   ├── dataset1.xml
│   └── output.xml
├── Kettle\ Data\ Set\ Group
│   ├── datasetgroup.xml
│   └── test.xml
├── Kettle\ Transformation\ Unit\ Test
│   ├── test1.xml
│   └── test2.xml
├── NamedCluster
│   └── my.xml
└── Spark\ Run\ Configuration
    └── Configuration\ 1.xml
```
HiromuHota commented 6 years ago

webspoon-7.1_multiuser is finally merged at 11e4d19e8c21aa80ce5f65d4de3779e50b0490b2. In addition, a22adb9bf6bab1b2a25bf219d77dc600a8ca6baa creates a separate metastore folder for each user. This applies only when user authentication is configured as described at https://github.com/HiromuHota/pentaho-kettle#user-authentication.

The .pentaho folder structure now looks like this:

```
$HOME/.pentaho/
├── caches
│   ├── ehcache
│   └── libfonts2
├── classic-engine
│   ├── system
│   └── user
├── metastore
│   └── pentaho
│       ├── Data\ Service\ Transformation
│       │   └── hoge.xml
│       ├── Default\ Run\ Configuration
│       │   └── Configuration\ 1.xml
│       ├── NamedCluster
│       │   └── hoge.xml
│       └── Spark\ Run\ Configuration
│           └── Configuration\ 2.xml
└── users
    ├── chrome
    │   └── metastore
    │       └── pentaho
    │           ├── Data\ Service\ Transformation
    │           │   └── hogechrome.xml
    │           ├── Default\ Run\ Configuration
    │           │   └── Configuration\ 1.xml
    │           ├── Kettle\ Data\ Set
    │           │   └── ge.xml
    │           ├── Kettle\ Data\ Set\ Group
    │           │   └── test.xml
    │           ├── Kettle\ Transformation\ Unit\ Test
    │           │   └── test.xml
    │           ├── NamedCluster
    │           │   └── hoge.xml
    │           └── Spark\ Run\ Configuration
    │               └── Configuration\ 22.xml
    └── safari
        └── metastore
            └── pentaho
                └── Default\ Run\ Configuration
                    └── Configuration\ 1.xml
```
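
The per-user metastore layout above can be sketched roughly as follows. This only illustrates the directory convention and is not the code from a22adb9bf6bab1b2a25bf219d77dc600a8ca6baa; `metastore_dir` and `element_file` are hypothetical names, and falling back to the top-level metastore when no user is authenticated is an assumption based on the tree.

```python
from pathlib import Path
from typing import Optional


def metastore_dir(pentaho_home: Path, user: Optional[str]) -> Path:
    """Return the metastore root for `user` (hypothetical helper).

    `user` is None when user authentication is not configured. Using the
    top-level metastore in that case is an assumption of this sketch.
    """
    if user is None:
        return pentaho_home / "metastore" / "pentaho"
    return pentaho_home / "users" / user / "metastore" / "pentaho"


def element_file(pentaho_home: Path, user: Optional[str],
                 element_type: str, name: str) -> Path:
    """Return the XML file for one metastore element, e.g. a NamedCluster."""
    return metastore_dir(pentaho_home, user) / element_type / f"{name}.xml"
```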
HiromuHota commented 6 years ago

I updated the Configuration-Management page in the wiki to reflect the changes in 0.7.1.11.

HiromuHota commented 6 years ago

21ffc3e32bacc4355502b5dfffb0c4a9a7e5f651 makes repositories.xml dedicated to each user.

| Item | Description | As of 21ffc3e32bacc4355502b5dfffb0c4a9a7e5f651 | Desired | Comments |
| --- | --- | --- | --- | --- |
| kettle.properties | Main PDI properties file; contains global variables for low-level PDI settings | Shared | Shared | This affects all users; hence, should be managed only by admin. |
| shared.xml | Shared objects file | Dedicated | <-- | |
| db.cache | The database cache for metadata | Dedicated | <-- | |
| repositories.xml | Connection details for PDI database or solution repositories | Dedicated | <-- | |
| .spoonrc | User interface settings, including the last opened transformation/job | Dedicated | <-- | |
| .languageChoice | Default language for the PDI client tool | Shared | Dedicated | It takes more engineering effort to change, but one global setting may be just fine. |
| xulSettings.properties | Not documented | Dedicated | <-- | |
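
As a quick way to see what remains open, the two status tables in this thread can be written down as data and compared against the desired state. This is only a sketch: the names `AS_OF_E1067568`, `AS_OF_21FFC3E`, and `pending` are hypothetical, and reading "<--" as "desired equals current" is my interpretation of the tables.

```python
# (current, desired) scope per file; "<--" in the tables is spelled out here
# under the assumption that it means "same as current".
AS_OF_E1067568 = {
    "kettle.properties": ("Shared", "Shared"),
    "shared.xml": ("Dedicated", "Dedicated"),
    "db.cache": ("Dedicated", "Dedicated"),
    "repositories.xml": ("Shared", "Dedicated"),
    ".spoonrc": ("Dedicated", "Dedicated"),
    ".languageChoice": ("Shared", "Dedicated"),
    "xulSettings.properties": ("Dedicated", "Dedicated"),
}

# 21ffc3e changes only repositories.xml, which becomes dedicated.
AS_OF_21FFC3E = {**AS_OF_E1067568, "repositories.xml": ("Dedicated", "Dedicated")}


def pending(status):
    """Return the files whose current scope differs from the desired one."""
    return sorted(
        name for name, (current, desired) in status.items() if current != desired
    )
```

With these definitions, `pending(AS_OF_E1067568)` gives `['.languageChoice', 'repositories.xml']` and `pending(AS_OF_21FFC3E)` gives `['.languageChoice']`, so only the language setting remains shared against the stated preference.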