Open larsgw opened 8 years ago
Agreed this is a bug. It may be fixed in the dev branch...
Peter Murray-Rust Reader in Molecular Informatics Unilever Centre, Dep. Of Chemistry University of Cambridge CB2 1EW, UK +44-1223-763069
When using the standard AMI command from the tutorial (ami2-word --project PROJECTNAME -i scholarly.html --w.words wordFrequencies --w.stopwords stopwords.txt), where stopwords.txt is copied from the ami2-0.1-SNAPSHOT.jar, the output contains fragments of the CSS found in the style tag in scholarly.html.
Example of the output: [image: header28] https://cloud.githubusercontent.com/assets/14018963/15632566/f80289f0-2596-11e6-98b2-b677ff9d5abe.png
Input was the scholarly.html created with norma from PMC4350396.
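For anyone who wants to check the symptom independently of AMI, here is a minimal sketch in Python of counting words while skipping the contents of style (and script) elements. It is not AMI's implementation and says nothing about how ami2-word tokenizes internally; the names VisibleTextExtractor and word_frequencies are made up for this illustration, and the tokenizer (lowercase ASCII letters only) is an assumption chosen for brevity.

```python
from collections import Counter
from html.parser import HTMLParser
import re


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, ignoring anything inside <style> or <script>."""

    SKIPPED = {"style", "script"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def word_frequencies(html_text, stopwords=()):
    """Hypothetical helper: word counts for the text outside style/script."""
    parser = VisibleTextExtractor()
    parser.feed(html_text)
    parser.close()
    # Simplistic tokenizer (assumption): runs of ASCII letters, lowercased.
    words = re.findall(r"[a-z]+", " ".join(parser.chunks).lower())
    stop = set(stopwords)
    return Counter(w for w in words if w not in stop)


if __name__ == "__main__":
    sample = (
        "<html><head><style>div.header { font-size: 12pt; }</style></head>"
        "<body><p>The cell and the membrane</p></body></html>"
    )
    print(word_frequencies(sample, stopwords=["the", "and"]))
```

Comparing such a count against the wordFrequencies output for the same scholarly.html should show whether the unexpected entries come only from the style tag.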