Open philsv opened 3 months ago
I can confirm this bug coming up for Windows 10, Windows 11 and WSL2.
It seems like a race condition of some sort, possibly involving a use after free. It is not reliably reproducible.
My test case would be just:
adb = arcticdb.Arctic("lmdb://test.db")
if adb.has_library("test"):
    adb.delete_library("test")
alib = adb.create_library("test")
df_1 = pd.DataFrame({'column': [1.0, 2.0, 3.0]})
#df_1 = pd.DataFrame({'column': [1, 2, 3]})
payload_1 = arcticdb.WritePayload("symbol_1", df_1, metadata={'the': 'metadata'})
alib.write("symbol_1", df_1)
Depending on the dtype chosen for 'df_1', the errors differ: the assertion failure shown below for float, but a floating point exception for int.
Float
File "/<...>/trader/.venv/lib/python3.11/site-packages/arcticdb/version_store/library.py", line 455, in write
    return self._nvs.write(
File "/<...>/trader/.venv/lib/python3.11/site-packages/arcticdb/version_store/_store.py", line 587, in write
    vit = self.version_store.write_versioned_dataframe(
arcticdb_ext.exceptions.InternalException: E_ASSERTION_FAILURE Invalid dtype 3:1 - 'UNKNOWN' in visit dim
Int
Floating point exception
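The two `df_1` variants in the repro above differ only in the column dtype, which lines up with the error differing by type. A minimal pandas-only check (no ArcticDB involved) showing the dtypes the two payloads carry:

```python
import pandas as pd

# The two payload variants from the repro differ only in column dtype:
df_float = pd.DataFrame({"column": [1.0, 2.0, 3.0]})  # variant hitting the E_ASSERTION_FAILURE
df_int = pd.DataFrame({"column": [1, 2, 3]})          # variant hitting the floating point exception

print(df_float["column"].dtype)  # float64
print(df_int["column"].dtype)    # int64
```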
Modules: deps.txt
Maybe valgrind or ASAN can provide indications?
A debug build of 4.4.2 with gcc and ASAN enabled runs without issues.
Probably unrelated but running under valgrind yields:
Traceback (most recent call last):
  File "/<...>/trader/broken.py", line 30, in <module>
    exit(main())
  File "/<...>/trader/broken.py", line 14, in main
    adb.delete_library("test")
  File "/<...>/trader/venv/lib/python3.11/site-packages/arcticdb/arctic.py", line 232, in delete_library
    lib = self[name]
  File "/<...>/trader/venv/lib/python3.11/site-packages/arcticdb/arctic.py", line 99, in __getitem__
    self._library_manager.get_library(lib_mgr_name, storage_override),
arcticdb_ext.exceptions.InternalException: lmdb::runtime_error(mdb_env_open: Invalid argument)
@jankleinsorge anything I could do to prevent the E_ASSERTION_FAILURE within my pipeline running on debian 11.9?
@philsv I am just some random user. I guess the next best thing is to downgrade and see how that turns out.
Btw, a release (default) built with gcc12 fails with:
ArcticDB/cpp/arcticdb/pipeline/frame_slice.hpp:70:8: error: ‘(arcticdb::pipelines::FrameSlice)((char*)&
+ offsetof(folly::Try<std::tuple<arcticdb::stream::StreamSink::PartialKey, arcticdb::SegmentInMemory, arcticdb::pipelines::FrameSlice> >,folly::Try<std::tuple<arcticdb::stream::StreamSink::PartialKey, arcticdb::SegmentInMemory, arcticdb::pipelines::FrameSlice> >:: .folly::detail::TryBase<std::tuple<arcticdb::stream::StreamSink::PartialKey, arcticdb::SegmentInMemory, arcticdb::pipelines::FrameSlice> >:: )).arcticdb::pipelines::FrameSlice::row_range’ may be used uninitialized [-Werror=maybe-uninitialized]
@jankleinsorge, I was able to repro the issue you reported on 4.4.2. The issue does not happen on 4.4.3. Can you try that out?
@muhammadhamzasajjad I gave the 4.4.3 pip release a try and I have not encountered the issue since. But then again, I could not reliably reproduce it, so I cannot tell whether it has really been fixed. Let me know if I can lend a hand here.
Thanks @jankleinsorge. I was able to repro it on 4.4.2 every single time, and I didn't see it on 4.4.3. I'd recommend using 4.4.3 for now. We are going to investigate the issue further.
@philsv, I couldn't repro the issue you reported. It is hard to investigate it further without knowing what you are trying to append/write. Perhaps sharing the structure of your existing dataframe, remain_df and df (from your code), might help.
Hi @muhammadhamzasajjad,
I have switched to using update on version 4.4.3, and the error mentioned here was gone for me: https://github.com/man-group/ArcticDB/issues/1627
I will test append again on 4.4.3 and let you know if the error still persists there.
Here's what my code looks like:
...
existing_df = library.read(symbol).data
# Append the new data to the existing df
append_df = pd.concat([remaining_df, existing_df])
append_df = append_df[~append_df.index.duplicated(keep="last")]
if not append_df.index.is_monotonic_increasing:
    append_df = append_df.sort_index(ascending=True)
try:
    result = library.append(
        symbol,
        append_df,
        prune_previous_versions=prune_previous_versions,
    )
except Exception as e:
    # deletes the symbol and rewrites it to the database
    return self.rewrite_to_db(lib, symbol, df, library)
...
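Since the failure depends on what `append_df` ends up containing, here is a pandas-only walk-through of the concat/dedup/sort step above, with small illustrative stand-in frames. One thing worth noting: `existing_df` comes second in the `pd.concat`, so `keep="last"` prefers the stored values over `remaining_df` on overlapping dates, which may or may not be the intent:

```python
import pandas as pd

# Illustrative stand-ins: remaining_df carries a conflicting value for 2024-07-01.
remaining_df = pd.DataFrame(
    {"CBOE Volatility Index": [99.9, 13.3]},
    index=pd.DatetimeIndex(["2024-07-01", "2024-08-01"]),
)
existing_df = pd.DataFrame(
    {"CBOE Volatility Index": [12.22]},
    index=pd.DatetimeIndex(["2024-07-01"]),
)

# Same pipeline as the snippet above: concat, drop duplicate index
# entries keeping the last occurrence, then ensure a sorted index.
append_df = pd.concat([remaining_df, existing_df])
append_df = append_df[~append_df.index.duplicated(keep="last")]
if not append_df.index.is_monotonic_increasing:
    append_df = append_df.sort_index(ascending=True)

# existing_df's row wins on the overlapping date, because it appears last.
print(append_df.loc[pd.Timestamp("2024-07-01"), "CBOE Volatility Index"])  # 12.22
```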
This is what my remaining_df looks like:
CBOE Volatility Index
2024-08-01 13.3
remaining_df.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1 entries, 2024-08-01 to 2024-08-01
Data columns (total 1 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 CBOE Volatility Index 1 non-null float64
dtypes: float64(1)
memory usage: 16.0 bytes
This is how my test went:
+ python3.11 -m tests.arctic.test_db
2024-07-02 23:16:57 - Dataframe cboe_volatility_index already up to date in the database test: 1990-01-02 00:00:00 - 2024-07-01 00:00:00
2024-07-02 23:17:03 - cboe_volatility_index existing df (before append) info:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 8702 entries, 1990-01-02 to 2024-07-01
Data columns (total 1 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 CBOE Volatility Index 8702 non-null float64
dtypes: float64(1)
memory usage: 136.0 KB
2024-07-02 23:17:03 - cboe_volatility_index processed df (after append) info:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 8703 entries, 1990-01-02 to 2024-08-01
Data columns (total 1 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 CBOE Volatility Index 8703 non-null float64
dtypes: float64(1)
memory usage: 136.0 KB
20240702 23:17:09.266602 15696 E arcticdb.root | E_ASSERTION_FAILURE Can't append dataframe with start index 1990-01-02 00:00:00.0 to existing sequence ending at 2024-07-01 00:00:00.0
2024-07-02 23:17:10 - Rewritten dataframe cboe_volatility_index to the database test: 1990-01-02 00:00:00 - 2024-08-01 00:00:00
df.tail(5)
CBOE Volatility Index
2024-06-26 12.55
2024-06-27 12.24
2024-06-28 12.44
2024-07-01 12.22
2024-08-01 12.30
So with 4.4.3 there's still some issue I don't really understand, but it now hit the try block in my function and I was able to rewrite the data afterwards.
For now I will stick to library.update as I encounter no problems with that.
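For what it's worth, the `E_ASSERTION_FAILURE Can't append dataframe with start index 1990-01-02 ...` in the log above is append's stated precondition (per the error message itself): the new data's index must begin after the stored end. Since `pd.concat([remaining_df, existing_df])` rebuilds the full history, the frame handed to `append` starts at 1990-01-02. A pandas-only sketch of keeping just the genuinely new rows; the `library.append` call site is illustrative, not a confirmed fix:

```python
import pandas as pd

# Illustrative stand-ins for the frames from the log above.
existing_df = pd.DataFrame(
    {"CBOE Volatility Index": [12.22]},
    index=pd.DatetimeIndex(["2024-07-01"]),
)
remaining_df = pd.DataFrame(
    {"CBOE Volatility Index": [13.3]},
    index=pd.DatetimeIndex(["2024-08-01"]),
)

# Keep only rows strictly after the stored end, so the appended frame
# starts after the existing sequence rather than at 1990-01-02.
last_stored = existing_df.index.max()
new_rows = remaining_df[remaining_df.index > last_stored].sort_index()

# library.append(symbol, new_rows, prune_previous_versions=True)  # illustrative call
print(new_rows.index.min())  # earliest appended row, after 2024-07-01
```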
This seems like the same underlying issue as #1589 (sporadic, spurious rejections of legitimate appends).
Describe the bug
I get the following error: arcticdb_ext.exceptions.InternalException: E_ASSERTION_FAILURE Invalid dtype 'UNKNOWN' in visit dim. Reverting to an older version does not resolve the issue. I have tried to delete the symbol on Exception and write it again to the db, but I still get an arcticdb_ext.exceptions.InternalException on the next write attempt.
I have no problem executing the code locally on my Windows machine or on WSL2, but it does not work in my Bitbucket pipeline.
bitbucket pipeline stdout:
Another try (haven't changed any types):
Steps/Code to Reproduce
Expected Results
The write succeeds and does not throw the E_ASSERTION_FAILURE with dtype 'UNKNOWN'.
OS, Python Version and ArcticDB Version
Python 3.11.3 | Docker image: python:3.11.3-slim-bullseye via bitbucket-pipelines.yml | OS: Debian 11 | ArcticDB: 4.4.2 down to 4.0.0
Backend storage used
AWS S3
Additional Context
No response