pisa-engine / pisa

PISA: Performant Indexes and Search for Academia
https://pisa-engine.github.io/pisa/book
Apache License 2.0
941 stars 65 forks source link

Add option that restores the ability to perform index compression in memory #580

Open gustingonzalez opened 7 months ago

gustingonzalez commented 7 months ago

This adds the --in-memory option to the compress_inverted_index command, restoring the in-memory compression and allowing it to avoid using an intermediate buffer.

codecov[bot] commented 7 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 93.27%. Comparing base (ce97f7b) to head (e8d7d76).

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #580 +/- ## ======================================= Coverage 93.27% 93.27% ======================================= Files 89 89 Lines 4448 4448 ======================================= Hits 4149 4149 Misses 299 299 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

gustingonzalez commented 7 months ago

@elshize, done with the comments, but I'll be making some minor code refactoring as well.

gustingonzalez commented 7 months ago

@elshize, I've noticed a few tests are failing. This comes from the master branch, so it might be worth considering as another issue. Here's the error:

The following tests FAILED:
      8 - test_block_codecs:Example test case - pisa::optpfor_block (ILLEGAL)
     10 - test_block_codecs:Example test case - pisa::streamvbyte_block (ILLEGAL)
     11 - test_block_codecs:Example test case - pisa::maskedvbyte_block (ILLEGAL)
     17 - test_block_codecs:Example test case - pisa::simdbp_block (ILLEGAL)
     20 - test_block_codecs:Property test - pisa::streamvbyte_block (ILLEGAL)
     21 - test_block_codecs:Property test - pisa::maskedvbyte_block (ILLEGAL)
     27 - test_block_codecs:Property test - pisa::simdbp_block (ILLEGAL)
     28 - test_block_freq_index:block_freq_index (ILLEGAL)
     29 - test_block_posting_list:block_posting_list (ILLEGAL)
     30 - test_block_posting_list:block_posting_list_reordering (ILLEGAL)
     39 - test_compress:Compress index (ILLEGAL)
     40 - test_compress:Compress quantized index (ILLEGAL)
    115 - test_stream_builder:Stream builder for block index (ILLEGAL)
elshize commented 7 months ago

Tests seem to be working fine in CI. What system are you running this on? What compiler? Can you show the output of one of the test? E.g.:

./test/test_block_codecs
gustingonzalez commented 7 months ago

Tests seem to be working fine in CI. What system are you running this on? What compiler? Can you show the output of one of the test? E.g.:

./test/test_block_codecs

@elshize, effectively there is a problem with my compiler. I tested again compiling a debug version, and it worked fine. I have to look the reasons of this, but at the moment this isn't blocking for me. The error is "ILLEGAL INSTRUCTION", I'm using g++11:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test_block_codecs is a Catch v2.13.7 host application.
Run with -? for options

-------------------------------------------------------------------------------
Example test case - pisa::optpfor_block
-------------------------------------------------------------------------------
./test/test_block_codecs.cpp:67
...............................................................................

./test/test_block_codecs.cpp:81: FAILED:
due to a fatal error condition:
  SIGILL - Illegal instruction signal

===============================================================================
test cases: 1 | 1 failed
assertions: 1 | 1 failed

[1]    20027 illegal hardware instruction (core dumped)  ./test/test_block_codecs
elshize commented 7 months ago

@gustingonzalez is it possible you previously compiled the code on one machine and executed it on a different one?

gustingonzalez commented 7 months ago

@gustingonzalez is it possible you previously compiled the code on one machine and executed it on a different one?

Great! That was the reason!