benchflow / data-transformers

Spark scripts utilised to transform data to the BenchFlow internal formats

Mysql Transformer Error In Retrieving Configuration File #86

Closed VincenzoFerme closed 7 years ago

VincenzoFerme commented 7 years ago

Below is an execution log showing the error in the title. Moreover, when an error occurs, the computation must not be marked as succeeded, as it appears to be in the following log.

8/21/2016 6:11:18 PMReceived message: {"minio_key":"eaa2/BenchFlow.wfmsTest.1.1/BenchFlow.wfmsTest.1.1.1/benchflow.collector.mysql.db_BenchFlow.wfmsTest.1.1.1/mysql/mysql/process-engine","trial_id":"BenchFlow.wfmsTest.1.1.1","experiment_id":"BenchFlow.wfmsTest.1.1","container_id":"mysql","container_name":"mysql","host_id":"host","collector_name":"mysql"}
8/21/2016 6:11:18 PMmysql topic, submitting script data-transformers/transformers/mysqlTransformer.py, minio location: eaa2/BenchFlow.wfmsTest.1.1/BenchFlow.wfmsTest.1.1.1/benchflow.collector.mysql.db_BenchFlow.wfmsTest.1.1.1/mysql/mysql/process-engine, trial id: BenchFlow.wfmsTest.1.1.1
8/21/2016 6:11:18 PMTransformer work request queued for script mysql, camunda, 7.5.0, BenchFlow.wfmsTest.1.1.1, mysql, host, trial
8/21/2016 6:11:18 PMReceived work request for script mysql, camunda, 7.5.0, BenchFlow.wfmsTest.1.1.1, mysql, host, trial
8/21/2016 6:11:18 PMDispatching work request for script mysql, camunda, 7.5.0, BenchFlow.wfmsTest.1.1.1, mysql, host, trial
8/21/2016 6:11:18 PMTransformer worker submitting script:  [--master local[*] --jars /usr/spark/pyspark-cassandra-assembly-0.3.5.jar --driver-memory 12g --executor-memory 12g --conf spark.executor.heartbeatInterval=300000s --conf spark.storage.blockManagerSlaveTimeoutMs=300000s --conf spark.core.connection.ack.wait.timeout=300000s --conf spark.cassandra.connection.host=cassandra --driver-class-path /usr/spark/pyspark-cassandra-assembly-0.3.5.jar --py-files /app/data-transformers/commons/commons.py,/app/data-transformers/transformations/dataTransformations.py,/usr/spark/pyspark-cassandra-assembly-0.3.5.jar /app/data-transformers/transformers/mysqlTransformer.py {"cassandra_keyspace":"benchflow","minio_host":"minio","minio_port":"9000","minio_access_key":"CYNQML6R7V12MTT32W6P","minio_secret_key":"SQ96V5pg02Z3kZ/0ViF9YY6GwWzZvoBmElpzEEjn","file_bucket":"runs","file_path":"eaa2/BenchFlow.wfmsTest.1.1/BenchFlow.wfmsTest.1.1.1/benchflow.collector.mysql.db_BenchFlow.wfmsTest.1.1.1/mysql/mysql/process-engine","trial_id":"BenchFlow.wfmsTest.1.1.1","experiment_id":"BenchFlow.wfmsTest.1.1","config_file":"data-transformers.configuration.yml","container_id":"mysql","host_id":"host"}]
8/21/2016 6:11:18 PM/apps/bin:/usr/local/bin:/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/sbin
8/21/2016 6:11:22 PMTraceback (most recent call last):
8/21/2016 6:11:22 PM  File "/app/data-transformers/transformers/mysqlTransformer.py", line 196, in <module>
8/21/2016 6:11:22 PM    main()
8/21/2016 6:11:22 PM  File "/app/data-transformers/transformers/mysqlTransformer.py", line 140, in main
8/21/2016 6:11:22 PM    with open(confPath) as f:
8/21/2016 6:11:22 PMIOError: [Errno 2] No such file or directory: u'/tmp/spark-1f5c7d2d-5bd1-4645-accf-6511a3dddd34/userFiles-47d4f8d2-27ee-4f7e-88d5-66cd1f15a41f/data-transformers.configuration.yml'
8/21/2016 6:11:22 PM
8/21/2016 6:11:22 PMScript data-transformers/transformers/mysqlTransformer.py processed
8/21/2016 6:11:22 PMTransformer script completed successfully: mysql, camunda, 7.5.0, BenchFlow.wfmsTest.1.1.1, mysql, host, trial
8/21/2016 6:11:22 PMAll requirements met for: mysql

Please also test all the other transformers again in the current state of the system, and add test cases for all the problems encountered.
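
To illustrate the point about success reporting, here is a minimal sketch in Python that decides success from the script's exit status rather than from the fact that the script finished. The helper `run_script` and the failing command are hypothetical and are not taken from the BenchFlow scheduler. A script that dies with an uncaught exception, as in the traceback above, exits with a non-zero status.

```python
import subprocess
import sys


def run_script(command):
    """Run a command; report success only when it exits with status 0."""
    result = subprocess.run(command, capture_output=True, text=True)
    succeeded = result.returncode == 0
    return succeeded, result.stderr


# Hypothetical stand-in for a transformer that cannot open its configuration file.
failing = [
    sys.executable,
    "-c",
    "open('/nonexistent/data-transformers.configuration.yml')",
]

ok, stderr = run_script(failing)
if ok:
    print("Transformer script completed successfully")
else:
    # The uncaught exception makes the interpreter exit with a non-zero status.
    print("Transformer script failed")
```

With a check like this, the "completed successfully" message would only be logged when the script actually exited cleanly.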

Cerfoglg commented 7 years ago

@VincenzoFerme I think the issue here is not the script. Notice how in our sut plugins repo the folder for wfms is all lowercase? https://github.com/benchflow/sut-plugins

In the benchmark configuration files, how is wfms written in the SUT type field? If it's WfMS, with some capital letters, then the scheduler won't find the right configuration, because it expects wfms in all lowercase, as taken from the sut plugins repo.

simonedavico commented 7 years ago

@Cerfoglg in the configuration file capitalisation doesn't matter. One could write any lowercase/uppercase combination of the letters w, f, m, s. This is because the correct way to shorten "Workflow Management Systems" in the literature is WfMS :)

It's better if you don't assume lowercase or uppercase when you read field values from test configuration files. For sanity, when you read a field, convert it to all lowercase first (should be just a method call), and then compare its value.
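
A minimal sketch of that normalisation in Python. The function name `matches_sut_type` and the default expected value are hypothetical and are not taken from the scheduler code:

```python
def matches_sut_type(field_value, expected="wfms"):
    """Compare a configuration field to an expected value, ignoring case."""
    # Lowercase both sides so "WfMS", "WFMS" and "wfms" are treated alike.
    return field_value.lower() == expected.lower()


for spelling in ("WfMS", "wfms", "WFMS"):
    print(spelling, matches_sut_type(spelling))
```

Lowercasing both sides means the comparison stays correct even if the expected value is later written with capitals.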

Cerfoglg commented 7 years ago

@VincenzoFerme @simonedavico I was worried that capitalisation might be important at some point, but either way I applied a fix for that: https://github.com/benchflow/data-analyses-scheduler/pull/83