Closed jincongcong closed 5 years ago
std::map<std::wstring, std::vector
Hi!
You also have to use the BitArchiveInfo class along with BitExtractor. With it you can iterate over the items inside the archive, retrieve the path of each file (i.e. the keys of the map), and then use BitExtractor to extract the individual files, as in the following code example:
std::map< std::wstring, std::vector< byte_t > > result;
{
    std::wstring input_file = L"test.7z";
    BitArchiveInfo info( lib, input_file, BitFormat::SevenZip );
    BitExtractor extractor( lib, BitFormat::SevenZip );
    uint32_t items_count = info.itemsCount(); //number of items (folders + files) in the archive
    for ( uint32_t i = 0; i < items_count; ++i ) {
        bool isDir = info.getItemProperty( i, BitProperty::IsDir ).getBool();
        if ( isDir ) { continue; } //ignoring folder items
        //getting the path of the file at index i
        std::wstring path = info.getItemProperty( i, BitProperty::Path ).getString();
        //extracting the file at index i to the corresponding buffer in the map
        extractor.extract( input_file, result[ path ], i );
    }
}
I hope to implement this as an easier-to-use function in the next stable version!
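A side note on the `result[ path ]` expression in the loop above: `std::map::operator[]` inserts an empty vector under the key if it is not there yet and returns a reference to it, so the extraction call writes straight into the map. Here is a minimal sketch of just that mechanism. It does not use bit7z at all: `fake_extract` and `collect` are hypothetical stand-ins that only mimic the shape of the call, and `byte_t` is redefined locally so the snippet compiles without the library.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using byte_t = unsigned char; // local stand-in for bit7z's byte_t (assumption)

// Hypothetical stand-in for extractor.extract( archive, buffer, index ):
// it fills the buffer with index + 1 dummy bytes so the example needs no 7z.dll.
void fake_extract( std::vector< byte_t >& out_buffer, std::size_t index ) {
    out_buffer.assign( index + 1, static_cast< byte_t >( 'x' ) );
}

// Builds the path -> buffer map the same way the loop above does.
std::map< std::wstring, std::vector< byte_t > > collect( const std::vector< std::wstring >& paths ) {
    std::map< std::wstring, std::vector< byte_t > > result;
    for ( std::size_t i = 0; i < paths.size(); ++i ) {
        // operator[] creates an empty vector for a new key and returns a
        // reference to it, so the callee fills the map entry in place.
        fake_extract( result[ paths[ i ] ], i );
    }
    return result;
}
```

Calling `collect( { L"a.txt", L"dir/b.txt" } )` yields a map with two entries, one buffer per path, without any intermediate copy of the buffers.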
Thank you! But I tried this method and it took a very long time, about ten minutes! test.7z contains 10000+ files and is 3.39MB in size.
The performance problem is probably due to the fact that the extract method of BitExtractor is called for each file in the archive: each call reopens the archive, calls the 7-zip DLL extraction function, and closes the archive again. Repeating this for all 10k+ files is far from optimal, but that piece of code was only meant to implement the functionality using the current version of bit7z, without modifying its code. Implementing this kind of extraction as a new function inside BitExtractor makes it possible to optimize the accesses to the archive, as in the following code example, in which the archive is opened only once:
void BitExtractor::extract( const wstring& in_file, map< wstring, vector< byte_t > >& out_map ) {
    CMyComPtr< IInArchive > in_archive = openArchive( *this, mFormat, in_file );
    uint32_t number_items;
    in_archive->GetNumberOfItems( &number_items );
    uint32_t indices[] = { 0 };
    for ( uint32_t i = 0; i < number_items; ++i ) {
        BitPropVariant propvar;
        in_archive->GetProperty( i, kpidIsDir, &propvar );
        if ( propvar.getBool() ) { continue; } //ignore directories
        in_archive->GetProperty( i, kpidPath, &propvar ); //getting file path in the archive
        auto* extract_callback_spec = new MemExtractCallback( *this, in_archive, out_map[ propvar.getString() ] );
        indices[ 0 ] = i;
        CMyComPtr< IArchiveExtractCallback > extract_callback( extract_callback_spec );
        if ( in_archive->Extract( indices, 1, NExtract::NAskMode::kExtract, extract_callback ) != S_OK ) {
            throw BitException( extract_callback_spec->getErrorMessage() );
        }
    }
}
If you are using a custom build of bit7z, you could add this code to BitExtractor and see whether performance improves! This code can actually be optimized further: it calls the Extract method of the 7-zip DLL once for each file in the archive, instead of calling it only once with an array of the indices of all the files to be extracted. However, that approach would require a substantial rewrite of the MemExtractCallback class, since at the moment its constructor takes only one vector buffer! Anyway, as I said, I hope to implement the function in this way in the next version!
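To make the "call Extract only once" idea a bit more concrete, here is a rough sketch of its shape: first collect the indices of all non-directory items, then make a single batched extract call whose callback picks the destination buffer per index. Nothing in it is bit7z or 7-zip API. `ToyItem`, `ToyArchive`, and `extract_all` are hypothetical names, the "archive" is just an in-memory vector, and a real version would still need the MemExtractCallback rewrite mentioned above.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

using byte_t = unsigned char; // local stand-in for bit7z's byte_t (assumption)

// Hypothetical in-memory item; NOT a 7-zip or bit7z type.
struct ToyItem {
    std::wstring path;
    bool is_dir;
    std::vector< byte_t > data;
};

// Hypothetical archive that counts how often extract() is invoked.
struct ToyArchive {
    std::vector< ToyItem > items;
    int extract_calls = 0;

    // One invocation handles many indices; for each index the callback
    // tells the archive which buffer should receive the item's data.
    void extract( const std::vector< uint32_t >& indices,
                  const std::function< std::vector< byte_t >&( uint32_t ) >& get_buffer ) {
        ++extract_calls;
        for ( uint32_t index : indices ) {
            get_buffer( index ) = items[ index ].data;
        }
    }
};

std::map< std::wstring, std::vector< byte_t > > extract_all( ToyArchive& archive ) {
    std::map< std::wstring, std::vector< byte_t > > out_map;
    std::vector< uint32_t > indices;
    for ( uint32_t i = 0; i < archive.items.size(); ++i ) {
        if ( !archive.items[ i ].is_dir ) { indices.push_back( i ); } // skip directories
    }
    // Single batched call: the callback maps each index to its map entry.
    archive.extract( indices, [ & ]( uint32_t index ) -> std::vector< byte_t >& {
        return out_map[ archive.items[ index ].path ];
    } );
    return out_map;
}
```

The point of the design is that the callback, not the caller, owns the index-to-buffer mapping, which is why a callback built around one vector buffer would not be enough for this approach.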
Thank you very much!
I found the code to extract a file by its index:
bit7z::Bit7zLibrary lib(L"7z.dll");
bit7z::BitExtractor extractor(lib, bit7z::BitFormat::SevenZip);
std::vector<bit7z::byte_t> out_buffer;
unsigned index = 0;
extractor.extract(L"test.7z", out_buffer, index);
So how can I extract all the files in the archive, each into its own buffer, into a std::map<std::wstring, std::vector<bit7z::byte_t>>?