dice-group / ida-pg

GNU Affero General Public License v3.0

New scraper and scrape format #114

Closed. Cortys closed this 5 years ago.

Cortys commented 5 years ago

This PR includes the new configurable scraper implementation along with a scrape config for scikit-learn. The scraper can be used via a simple CLI that is documented in lib-scraper/README.md.

The main differences of the new scraper compared to the old one are:

I also included a very basic integration of the new scrape format into ida-ws to show that it works. This integration is only preliminary, however: it just queries the new scrape database for the same data that the old scrape already contained. It should be replaced by a more general code generator in the future.

codecov-io commented 5 years ago

Codecov Report

Merging #114 into master will decrease coverage by 2.88%. The diff coverage is 100%.


@@             Coverage Diff              @@
##             master     #114      +/-   ##
============================================
- Coverage     23.84%   20.96%   -2.89%     
+ Complexity       99       98       -1     
============================================
  Files            48       54       +6     
  Lines          1279     1436     +157     
  Branches        146      160      +14     
============================================
- Hits            305      301       -4     
- Misses          962     1123     +161     
  Partials         12       12
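The percentages in the diff above can be rechecked from the raw counts. The sketch below assumes line coverage is computed as hits divided by total lines (hits + misses + partials); that formula is an assumption, not something stated in the report, and the function name `coverage` is purely illustrative.

```python
# Counts copied from the Codecov diff above. The formula is an assumption:
# coverage = hits / (hits + misses + partials).
def coverage(hits: int, misses: int, partials: int) -> float:
    """Return line coverage in percent for the given counts."""
    total = hits + misses + partials
    return 100.0 * hits / total


master = coverage(hits=305, misses=962, partials=12)    # 1279 lines in total
merged = coverage(hits=301, misses=1123, partials=12)   # 1436 lines in total
delta = merged - master                                 # negative: coverage drops

print(f"master: {master:.3f}%")
print(f"#114:   {merged:.3f}%")
print(f"delta:  {delta:.3f} percentage points")
```

Under this assumption the counts add up to the reported line totals (1279 and 1436), and the drop comes out at a little under 2.89 percentage points. Hits fall by only 4 while lines grow by 157, so most of the decrease comes from new uncovered lines rather than from lost coverage. The small mismatch between 2.88% in the summary sentence and -2.89% in the diff would be consistent with one value being truncated and the other rounded, though that is only an inference from these numbers.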
| Impacted Files | Coverage Δ | Complexity Δ |
|---|---|---|
| ...c/main/java/upb/ida/provider/DataDumpProvider.java | 100% <100%> (ø) | 5 <4> (-4) |
| ida-ws/src/main/java/upb/ida/util/FileUtil.java | 34.78% <0%> (-7.81%) | 7% <0%> (+1%) |
| ...da-ws/src/main/java/upb/ida/util/DataDumpUtil.java | 78.57% <0%> (-7.15%) | 4% <0%> (-1%) |
| ...c/main/java/upb/ida/bean/cluster/ClusterParam.java | 30.43% <0%> (-4.35%) | 3% <0%> (ø) |
| ...ain/java/upb/ida/bean/cluster/ClusterAlgoDesc.java | 37.5% <0%> (-4.17%) | 3% <0%> (-1%) |
| .../java/upb/ida/provider/RiveScriptBeanProvider.java | 100% <0%> (ø) | 3% <0%> (+1%) |
| ...s/src/main/java/upb/ida/venndiagram/VENN_ITEM.java | 0% <0%> (ø) | 0% <0%> (?) |
| ...s/src/main/java/upb/ida/venndiagram/VENN_Util.java | 7.14% <0%> (ø) | 1% <0%> (?) |
| .../java/upb/ida/venndiagram/VENN_DATA_GENERATOR.java | 0% <0%> (ø) | 0% <0%> (?) |
| .../main/java/upb/ida/provider/GeoDiagramHandler.java | 3.44% <0%> (ø) | 1% <0%> (?) |

... and 2 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 7e216e7...6562311.