eu-nebulous / monitoring-data-persistor

Mozilla Public License 2.0

empty bucket in the influxdb even if metrics are flowing #5

Open jmarchel7bulls opened 1 day ago

jmarchel7bulls commented 1 day ago

There is a problem with saving data to the bucket:

2024-09-25 13:51:30,266 - DEBUG - [monitoring_eu.nebulouscloud.monitoring.realtime.>] checking if link is the same 5460d196-d6aa-40aa-8c88-9e2963f5e4b5-topic://eu.nebulouscloud.monitoring.realtime.>=5460d196-d6aa-40aa-8c88-9e2963f5e4b5-topic://eu.nebulouscloud.monitoring.realtime.> and application None=1345352509mercabarna-intralogistics1727264735013 == True
2024-09-25 13:51:30,266 - INFO - New monitoring data arrived at topic realtime.Mean_new_job_ConsumersCount

The attached screenshot (Zrzut ekranu z 2024-09-25 15-55-56) shows the application's bucket. Here is the link as well: http://localhost:8086/orgs/0a85cc5e2e223465/data-explorer?bucket=nebulous_1345352509mercabarna-intralogistics1727264735013_bucket. Please port-forward to nebulous-cd to open it.

For example, in the LSTM I got this error:

[2024-09-25 13:51:26] Could not create a prediction for some or all of the metrics for time point 1727272402.0, proceeding to next prediction time. However, 0 predictions were produced (out of the configured 8). The encountered exception trace follows:
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/app/runtime/Predictor.py", line 106, in predict_attribute
    prediction_results = predict_with_lstm(application_state.prediction_data_filename, attribute,
  File "/app/lstm/lstm.py", line 39, in predict_with_lstm
    data_to_process = pd.read_csv(data_filename)
  File "/usr/local/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1026, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/usr/local/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 620, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/usr/local/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1620, in __init__
    self._engine = self._make_engine(f, self.engine)
  File "/usr/local/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1898, in _make_engine
    return mapping[engine](f, **self.options)
  File "/usr/local/lib/python3.10/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 93, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "parsers.pyx", line 581, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/runtime/Predictor.py", line 232, in calculate_and_publish_predictions
    prediction = predict_attributes(application_state, prediction_time)
  File "/app/runtime/Predictor.py", line 161, in predict_attributes
    attribute_predictions[attribute] = attribute_predictions[attribute].get() # get the results of the processing
  File "/usr/local/lib/python3.10/multiprocessing/pool.py", line 774, in get
    raise self._value
pandas.errors.EmptyDataError: No columns to parse from file
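The `EmptyDataError` in this trace is what `pd.read_csv` raises when the file it is given has nothing to parse, which is consistent with the bucket being empty. As a rough sketch only (the helper name `csv_has_content` is hypothetical and not part of the predictor code), a caller could check the data file before parsing it and skip the prediction round with a clearer message:

```python
import os


def csv_has_content(path):
    """Return True if `path` is a file containing at least one non-blank line.

    Hypothetical pre-check: a file that fails this test is the kind of input
    for which pd.read_csv reports "No columns to parse from file".
    """
    if not os.path.isfile(path):
        return False
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                return True
    return False
```

A caller would run `csv_has_content(data_filename)` before `pd.read_csv(data_filename)` and log that no monitoring data has been persisted yet when it returns `False`. This only makes the symptom easier to read; it does not address why the bucket is empty.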

atsag commented 1 day ago

@jmarchel7bulls I can see the following log received earlier from my component:

2024-09-25 05:32:05,437 - INFO - Connecting to Url('amqp://nebulous-activemq:5672')...
Exception in thread Thread-1019 (start):
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.11/site-packages/exn/connector.py", line 80, in start
    self.context.start(Manager(f"{self.url}:{self.port}"),self.handler)
  File "/usr/local/lib/python3.11/site-packages/exn/core/context.py", line 38, in start
    self._manager.start()
  File "/usr/local/lib/python3.11/site-packages/exn/core/manager.py", line 37, in start
    self.container.run()
  File "/usr/local/lib/python3.11/site-packages/proton/_reactor.py", line 197, in run
    while self.process():
          ^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/proton/_reactor.py", line 262, in process
    event.dispatch(self._global_handler)
  File "/usr/local/lib/python3.11/site-packages/proton/_events.py", line 157, in dispatch
    _dispatch(handler, type.method, self)
  File "/usr/local/lib/python3.11/site-packages/proton/_events.py", line 130, in _dispatch
    handler.on_unhandled(method, *args)
  File "/usr/local/lib/python3.11/site-packages/proton/_reactor.py", line 904, in on_unhandled
    event.dispatch(self.base)
  File "/usr/local/lib/python3.11/site-packages/proton/_events.py", line 157, in dispatch
    _dispatch(handler, type.method, self)
  File "/usr/local/lib/python3.11/site-packages/proton/_events.py", line 128, in _dispatch
    m(*args)
  File "/usr/local/lib/python3.11/site-packages/proton/_handlers.py", line 1168, in on_reactor_quiesced
    readable, writable, expired = self._selector.select(r.timeout)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/proton/_io.py", line 160, in select
    r, w, ex = select_inner(timeout)
               ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/proton/_io.py", line 146, in select_inner
    return IO.select(r, w, w, timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/proton/_io.py", line 72, in select
    return select.select(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: filedescriptor out of range in select()
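For context, Python's `select.select()` raises this `ValueError` when it is handed a file descriptor number at or above the platform's `FD_SETSIZE`, which is commonly 1024 on Linux. One possible reading, not confirmed by the log, is that the process had accumulated a large number of open descriptors by the time `Thread-1019` started. The following is a minimal diagnostic sketch, assuming a Linux container where `/proc/self/fd` is available; the function names and the 1024 constant are illustrative assumptions, not part of the component:

```python
import os

# Assumption: glibc's FD_SETSIZE, the usual select() ceiling on Linux.
ASSUMED_FD_SETSIZE = 1024


def open_fd_numbers():
    """Return the sorted fd numbers open in this process, or [] if /proc is unavailable."""
    try:
        names = os.listdir("/proc/self/fd")
    except OSError:
        return []
    return sorted(int(name) for name in names if name.isdigit())


def exceeds_select_limit(fds, limit=ASSUMED_FD_SETSIZE):
    """Return True if any fd number is too large to pass to select.select()."""
    return any(fd >= limit for fd in fds)
```

Logging `len(open_fd_numbers())` periodically from inside the component would show whether the descriptor count keeps growing between reconnects, which would help tell a descriptor leak apart from a one-off failure.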

Can you notify Fotis so he can debug this? Otherwise, we will need to restart the component and run the experiment again to verify the error.