When a dataset in GTF format is used in a remote query, the following happens.
Example query:
import gmql as gl
gl.set_remote_address("http://gmql.eu/gmql-rest/")
gl.login()
gl.set_mode("remote")
d1 = gl.load_from_remote("Example_Dataset_1", owner="public")
r = d1.materialize()
The following exception is raised:
Traceback (most recent call last):
File "C:/Users/lucan/Documents/progetti_phd/PyGMQL/test/test_map.py", line 8, in <module>
r = d1.materialize()
File "C:\Users\lucan\Documents\progetti_phd\PyGMQL\gmql\dataset\GMQLDataset.py", line 1191, in materialize
return Materializations.materialize_remote(new_index, output_name, output_path, all_load)
File "C:\Users\lucan\Documents\progetti_phd\PyGMQL\gmql\dataset\loaders\Materializations.py", line 83, in materialize_remote
result = remote_manager.execute_remote_all(output_path=download_path)
File "C:\Users\lucan\Documents\progetti_phd\PyGMQL\gmql\RemoteConnection\RemoteManager.py", line 520, in execute_remote_all
return self._execute_dag(serialized_dag, output, output_path)
File "C:\Users\lucan\Documents\progetti_phd\PyGMQL\gmql\RemoteConnection\RemoteManager.py", line 553, in _execute_dag
self.download_dataset(dataset_name=name, local_path=path)
File "C:\Users\lucan\Documents\progetti_phd\PyGMQL\gmql\RemoteConnection\RemoteManager.py", line 379, in download_dataset
return self.download_as_stream(dataset_name, local_path)
File "C:\Users\lucan\Documents\progetti_phd\PyGMQL\gmql\RemoteConnection\RemoteManager.py", line 402, in download_as_stream
samples = self.get_dataset_samples(dataset_name)
File "C:\Users\lucan\Documents\progetti_phd\PyGMQL\gmql\RemoteConnection\RemoteManager.py", line 226, in get_dataset_samples
return self.process_info_list(res, "info")
File "C:\Users\lucan\Documents\progetti_phd\PyGMQL\gmql\RemoteConnection\RemoteManager.py", line 188, in process_info_list
res = pd.concat([res, pd.DataFrame.from_dict(res[info_column].map(extract_infos).tolist())], axis=1)\
File "C:\Users\lucan\Anaconda3\envs\bio\lib\site-packages\pandas\core\frame.py", line 2139, in __getitem__
return self._getitem_column(key)
File "C:\Users\lucan\Anaconda3\envs\bio\lib\site-packages\pandas\core\frame.py", line 2146, in _getitem_column
return self._get_item_cache(key)
File "C:\Users\lucan\Anaconda3\envs\bio\lib\site-packages\pandas\core\generic.py", line 1842, in _get_item_cache
values = self._data.get(item)
File "C:\Users\lucan\Anaconda3\envs\bio\lib\site-packages\pandas\core\internals.py", line 3843, in get
loc = self.items.get_loc(item)
File "C:\Users\lucan\Anaconda3\envs\bio\lib\site-packages\pandas\core\indexes\base.py", line 2527, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas/_libs/index.pyx", line 117, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 139, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1265, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1273, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'info'
Process finished with exit code 1
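For context, the final KeyError can be reproduced in isolation: indexing a pandas DataFrame by a column it does not contain raises exactly this error. This suggests that the DataFrame built from the server response inside process_info_list simply has no "info" column, presumably because the failed remote job returned no sample-info payload. The snippet below is an illustrative minimal reproduction, not PyGMQL code:

```python
import pandas as pd

# Simulate a server response whose parsed DataFrame lacks the "info"
# column (hypothetical shape; the real response fields may differ).
res = pd.DataFrame({"id": [1], "name": ["sample_1"]})

try:
    res["info"]  # same column lookup as in process_info_list
except KeyError as e:
    print(e)  # the 'info' key is reported, matching the traceback
```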
This is the exception raised by the GMQL server (Netty log):
2018-03-16 11:20:38,557 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - 18/03/16 11:20:38 ERROR GMQLSparkExecutor: empty.reduceLeft
2018-03-16 11:20:38,557 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - java.lang.UnsupportedOperationException: empty.reduceLeft
2018-03-16 11:20:38,558 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at scala.collection.TraversableOnce$class.reduceLeft(TraversableOnce.scala:180)
2018-03-16 11:20:38,558 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at scala.collection.AbstractTraversable.reduceLeft(Traversable.scala:104)
2018-03-16 11:20:38,558 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at scala.collection.TraversableOnce$class.reduce(TraversableOnce.scala:208)
2018-03-16 11:20:38,558 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at scala.collection.AbstractTraversable.reduce(Traversable.scala:104)
2018-03-16 11:20:38,558 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at it.polimi.genomics.profiling.Profilers.Profiler$.profile(Profiler.scala:147)
2018-03-16 11:20:38,558 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at it.polimi.genomics.spark.implementation.GMQLSparkExecutor$$anonfun$implementation$1.apply(GMQLSparkExecutor.scala:144)
2018-03-16 11:20:38,558 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at it.polimi.genomics.spark.implementation.GMQLSparkExecutor$$anonfun$implementation$1.apply(GMQLSparkExecutor.scala:112)
2018-03-16 11:20:38,558 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:73)
2018-03-16 11:20:38,558 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at scala.collection.mutable.MutableList.foreach(MutableList.scala:30)
2018-03-16 11:20:38,558 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at it.polimi.genomics.spark.implementation.GMQLSparkExecutor.implementation(GMQLSparkExecutor.scala:112)
2018-03-16 11:20:38,558 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at it.polimi.genomics.spark.implementation.GMQLSparkExecutor.go(GMQLSparkExecutor.scala:59)
2018-03-16 11:20:38,558 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at it.polimi.genomics.GMQLServer.GmqlServer.run(GmqlServer.scala:23)
2018-03-16 11:20:38,558 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at it.polimi.genomics.cli.GMQLExecuteCommand$.main(GMQLExecuteCommand.scala:265)
2018-03-16 11:20:38,558 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at it.polimi.genomics.cli.GMQLExecuteCommand.main(GMQLExecuteCommand.scala)
2018-03-16 11:20:38,558 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2018-03-16 11:20:38,558 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2018-03-16 11:20:38,558 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2018-03-16 11:20:38,558 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at java.lang.reflect.Method.invoke(Method.java:498)
2018-03-16 11:20:38,559 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
2018-03-16 11:20:38,559 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
2018-03-16 11:20:38,559 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
2018-03-16 11:20:38,559 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
2018-03-16 11:20:38,559 [INFO] from org.apache.spark.launcher.app.GMQLExecuteCommand in launcher-proc-17 - at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
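The server-side java.lang.UnsupportedOperationException: empty.reduceLeft indicates the profiler calls reduce on an empty collection, i.e. the GTF job produced nothing to profile, so the REST API presumably returns an error payload with no sample list, and the client then fails with the opaque KeyError above. Independently of the server fix, the client could fail with a clearer message by guarding the column lookup. The following is a hypothetical sketch of such a guard (simplified: it omits the extract_infos mapping used in the real process_info_list), not the actual PyGMQL implementation:

```python
import pandas as pd

def process_info_list(res: pd.DataFrame, info_column: str) -> pd.DataFrame:
    """Hypothetical guard: expand the info column only if it is present."""
    if info_column not in res.columns:
        # Surface a clear error instead of an opaque pandas KeyError.
        raise RuntimeError(
            f"Server response has no '{info_column}' column; "
            "the remote GMQL job probably failed (see server log)."
        )
    # Expand the per-sample info dicts into their own columns.
    expanded = pd.DataFrame(res[info_column].tolist(), index=res.index)
    return pd.concat([res.drop(columns=[info_column]), expanded], axis=1)
```

With this guard, the GTF failure above would report the missing column and point the user at the server log rather than raising KeyError: 'info' from deep inside pandas.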