Open meshiguge opened 6 years ago
I want to split a WARC file into small chunks and then process them with `multiprocessing` in Python. For a text file we can use `seek`, but how do I seek with the warc module, or in .gz WARC files? Any advice?
You can open it as a gzip file and seek to the position you need, then pass that file object to WARCReader.
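A minimal sketch of that idea using only the standard library. It assumes the .warc.gz was written as one gzip member per record, which is a common convention but not guaranteed for every file. The helper names `gzip_member_offsets` and `open_at` are made up for illustration and are not part of the warc module. Note that seeking on a `gzip.GzipFile` works on uncompressed offsets and has to decompress everything before the target, so this sketch seeks on the raw compressed file to a member boundary instead.

```python
import gzip
import zlib
from contextlib import contextmanager


def gzip_member_offsets(path, read_size=64 * 1024):
    """Return the compressed byte offset at which each gzip member starts.

    Hypothetical helper: scans the file once, decompressing as it goes.
    """
    offsets = []
    with open(path, "rb") as raw:
        pos = 0  # compressed offset of the next byte to be consumed
        buf = raw.read(read_size)
        while buf:
            offsets.append(pos)
            d = zlib.decompressobj(31)  # 31 = 16 + MAX_WBITS, expect a gzip header
            while True:
                d.decompress(buf)  # output is discarded, we only need boundaries
                if d.eof:
                    # unused_data holds the bytes that follow this member
                    pos += len(buf) - len(d.unused_data)
                    buf = d.unused_data
                    break
                pos += len(buf)
                buf = raw.read(read_size)
                if not buf:
                    raise EOFError("truncated gzip member")
            if not buf:
                buf = raw.read(read_size)
    return offsets


@contextmanager
def open_at(path, offset):
    """Open the file, seek to a member boundary, and yield a gzip reader."""
    with open(path, "rb") as raw:
        raw.seek(offset)  # seek in the compressed file, not the decompressed stream
        with gzip.GzipFile(fileobj=raw, mode="rb") as gz:
            yield gz
```

Each worker could then be handed `(path, offset)` and a record count, call `open_at` itself, and pass the yielded file object to WARCReader as suggested above. File objects should be opened inside the worker rather than shared across processes. The reader keeps going through all later members, so the worker has to stop after its own share of records.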