wsdm-cup-2017 / riberry

The Riberry Vandalism Detector

Compressor change from *.bz2 to *.7z #1

Open XuYiwen opened 8 years ago

XuYiwen commented 8 years ago

Compressor used in: wsdmcup17-wdvd-baseline-feature-extraction/src/main/java/de/upb/wdqa/wdvd/processors/output/CsvFeatureWriter.java

import org.apache.commons.compress.compressors....

This is from Apache Commons Compress™.

To switch to 7z, follow the examples here: https://commons.apache.org/proper/commons-compress/examples.html

yutuofish2 commented 8 years ago

CsvFeatureWriter.java is used to output bz2 files.

The files that need to be modified in order to read 7z files directly: FeatureExtractor.java, CorpusLabelProcessor.java, TagDownloader.java, GeolocationDatabase.java
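While these readers are being migrated one by one, it may help to tell the two formats apart before choosing a decoder. Below is a minimal sketch that uses only the JDK and checks the leading signature bytes: 7z archives begin with 37 7A BC AF 27 1C, and bzip2 streams begin with "BZh". The class and method names are hypothetical and not part of the repository; the actual decoding would still go through Commons Compress.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of the riberry code base. */
final class ArchiveFormatSniffer {

    enum Format { SEVEN_Z, BZIP2, UNKNOWN }

    // 7z signature: '7' 'z' 0xBC 0xAF 0x27 0x1C
    private static final byte[] SEVEN_Z_MAGIC =
            {0x37, 0x7A, (byte) 0xBC, (byte) 0xAF, 0x27, 0x1C};

    // bzip2 signature: 'B' 'Z' 'h'
    private static final byte[] BZIP2_MAGIC = {0x42, 0x5A, 0x68};

    private ArchiveFormatSniffer() { }

    /** Classifies a buffer that holds the first bytes of a file. */
    static Format detect(byte[] header) {
        if (startsWith(header, SEVEN_Z_MAGIC)) {
            return Format.SEVEN_Z;
        }
        if (startsWith(header, BZIP2_MAGIC)) {
            return Format.BZIP2;
        }
        return Format.UNKNOWN;
    }

    /**
     * Reads up to six bytes from the stream and classifies them.
     * The bytes are consumed, so reopen (or mark/reset) the stream
     * before handing it to a decoder.
     */
    static Format detect(InputStream in) throws IOException {
        byte[] header = new byte[SEVEN_Z_MAGIC.length];
        int filled = 0;
        while (filled < header.length) {
            int n = in.read(header, filled, header.length - filled);
            if (n < 0) {
                break; // end of stream before the header was complete
            }
            filled += n;
        }
        return detect(Arrays.copyOf(header, filled));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A reader could call ArchiveFormatSniffer.detect(...) on the first bytes of an input file and then take either the existing bz2 path or the new 7z path. Keep in mind that 7z is an archive container rather than a plain compressed stream, so the 7z path generally needs a file on disk instead of an arbitrary InputStream.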

I have modified FeatureExtractor.java and uploaded it as a branch. One test file is also modified so that the Maven tests pass.

Note that this only works with commons-compress-1.12.jar. Update the Maven dependencies if necessary.