JanWielemaker / rocks-predicates

Put predicates into a RocksDB database

Add non-exported predicate to delete the database and related memory structures. Useful for test cases. #9

Open EricGT opened 2 years ago

EricGT commented 2 years ago

Currently, manual testing requires deleting the predicates directory and halting SWI-Prolog.

It would be nice to have a non-exported predicate (called with the module name) that deletes the RocksDB files and clears the in-memory structures created for a RocksDB instance.

kamahen commented 2 years ago

There's already delete_directory_and_contents/1, although using it requires knowing that the RocksDB database is a directory with sub-directories, so it might make sense to wrap it in something.

JanWielemaker commented 2 years ago

Some rocks_destroy_database/1 may indeed make sense. Except for test scripts the value seems limited though.

EricGT commented 2 years ago

While creating a working version of this, I realized that it should be limited to creating and destroying the RocksDB database in a temporary directory. The directory argument can then be checked to ensure it is under current_prolog_flag(tmp_dir,Temp), which is the customary place for creating and destroying files for testing purposes.

Current values for current_prolog_flag(tmp_dir,Temp):

Ubuntu:

?- current_prolog_flag(tmp_dir,Temp).
Temp = '/tmp'.

Windows:

?- current_prolog_flag(tmp_dir,Temp).
Temp = 'C:\\Users\\Groot\\AppData\\Local\\Temp'.

EricGT commented 2 years ago

My current version of a predicate to delete the database and related memory structures.

delete_rocksdb :-
    rocks_preds:default_db(RocksDB_path),
    delete_rocksdb(RocksDB_path).

delete_rocksdb(RocksDB_path) :-
    rdb_close(RocksDB_path),
    abolish_module_tables(rocks_preds),
    current_prolog_flag(tmp_dir,Base_directory),
    (
        directory_member(Base_directory,RocksDB_path,[])
    ->
        delete_directory_and_contents(RocksDB_path)
    ;
        true
    ).

Note: This only deletes the RocksDB files if they are located in the OS-defined temporary directory. I feel that using delete_directory_and_contents/1 in production code without some form of automatic check or manual user confirmation should be avoided.
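
As far as I can tell, directory_member/3 without recursive(true) only enumerates the direct entries of the temporary directory, and it only matches when RocksDB_path is spelled exactly as Base_directory/Entry. A path-based test is one possible alternative. The following is a minimal sketch under that assumption; path_under_tmp/1 is a hypothetical helper, not part of rocks_preds, and it does not resolve symbolic links.

```prolog
%!  path_under_tmp(+Path) is semidet.
%
%   Hypothetical helper (not part of rocks_preds): true if Path, once
%   made absolute, lies strictly below the directory named by the
%   tmp_dir flag. Symbolic links are not resolved.
path_under_tmp(Path) :-
    current_prolog_flag(tmp_dir, Tmp0),
    absolute_file_name(Tmp0, Tmp),
    absolute_file_name(Path, Abs),
    % Make sure the prefix ends in exactly one separator, so that
    % '/tmpfoo' is not accepted as being below '/tmp'.
    (   atom_concat(_, /, Tmp)
    ->  Prefix = Tmp
    ;   atom_concat(Tmp, /, Prefix)
    ),
    atom_concat(Prefix, Rest, Abs),
    % Reject the temporary directory itself.
    Rest \== ''.
```

With this helper, the condition in delete_rocksdb/1 could read path_under_tmp(RocksDB_path) instead of the directory_member/3 call. That would also accept databases nested deeper under the temporary directory, which may or may not be what is wanted.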


With regard to the comment

"Except for test scripts the value seems limited though."

One scenario where this predicate is very useful is when working with different data sets from the Prolog top level.

Normally, when end users work with data stored in an SQL database, they are almost never allowed to write and submit SQL queries. However, DBAs and developers often need a command line tool to access the dataset and run ad hoc queries. This is so common that for databases with an ODBC interface there is:

isql, iusql - unixODBC command-line interactive SQL tool.

My current similar predicate is named repl. Yes, I know the following repl predicate needs work and even a name change, but it is shown to give an idea of how delete_rocksdb/N is used.

repl(Rocksdb_directory) :-
    repl(Rocksdb_directory,[]).

repl(Rocksdb_directory,Options) :-
    setup_call_cleanup(
        (
            rdb_open(Rocksdb_directory,_RocksDB),
            (
                option(data(prolog(Prolog_data_path)),Options)
            ->
                (
                    option(module(M),Options)
                ->
                    rdb_load_file(M:Prolog_data_path)
                ;
                    rdb_load_file(Prolog_data_path)
                )
            ;
                (
                    option(data(csv(CSV_files)),Options)
                ->
                    (
                        option(module(M),Options)
                    ->
                        foreach(
                            member(CSV_file,CSV_files),
                            rdb_load_csv_file(M:CSV_file)
                        )
                    ;
                        foreach(
                            member(CSV_file,CSV_files),
                            rdb_load_csv_file(CSV_file)
                        )
                    )
                ;
                    true
                )
            )
        ),
        (
            format('REPL for RocksDB: ~w~n',[Rocksdb_directory]),
            break
        ),
        (
            format('Closing RocksDB: ~w~n',[Rocksdb_directory]),
            (
                option(delete(true),Options)
            ->
                delete_rocksdb(Rocksdb_directory)
            ;
                true
            )
        )
    ).

This repl/N allows me to load a RocksDB database with data and access the data via interop predicates based on rdb_clause. Then, to change to a different dataset, pressing Ctrl-D resets the state and files so that repl/N can be run again without having to halt/0 the SWI-Prolog toplevel. It really is handy.

The entire process has not been thoroughly vetted, but I expect many developers will want it once they know it can be done.

kamahen commented 2 years ago

Most modern SQL databases have CREATE TABLE IF NOT EXISTS and DROP TABLE IF EXISTS or similar. These are dangerous, but they're also very useful for doing tests (DROP TABLE is dangerous in general because it typically can't be rolled back). I don't see any reason to check for "/tmp", although it might be nice to have additional "create" and "delete" predicates specifically designed for testing (using the tmp_dir flag).
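
A test-oriented create/delete pair could look roughly like the sketch below. It is only an illustration: with_tmp_directory/2 is a hypothetical name, nothing like it is known to exist in rocks_preds, and the sketch manages the directory only, leaving the opening and closing of the database to the caller.

```prolog
:- use_module(library(filesex)).   % delete_directory_and_contents/1

:- meta_predicate with_tmp_directory(-, 0).

%!  with_tmp_directory(-Dir, :Goal) is semidet.
%
%   Hypothetical test helper: create a fresh directory with a unique
%   name obtained from tmp_file/2, run Goal once with Dir bound to it,
%   and remove the directory and its contents afterwards, whether
%   Goal succeeds, fails or raises an exception.
with_tmp_directory(Dir, Goal) :-
    setup_call_cleanup(
        ( tmp_file(rocksdb_test, Dir),   % only generates a unique name
          make_directory(Dir)
        ),
        once(Goal),                      % run Goal for its first solution only
        delete_directory_and_contents(Dir)).

% Usage sketch, assuming rdb_open/2 and rdb_close/1 behave as they are
% used earlier in this thread; run_my_tests/0 is a placeholder:
%
% ?- with_tmp_directory(Dir,
%        setup_call_cleanup(rdb_open(Dir, _DB),
%                           run_my_tests,
%                           rdb_close(Dir))).
```

Because the directory name comes from tmp_file/2, no explicit check against the tmp_dir flag is needed in this variant: the helper never deletes a directory it did not create itself.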