ContentMine / getpapers

Get metadata, fulltexts or fulltext URLs of papers matching a search query
MIT License

IEEE API throws error #175

Open sdruskat opened 5 years ago

sdruskat commented 5 years ago

Using 0.4.17 on Ubuntu 18.04 LTS with Node v12.7.0 and NPM 6.10.0 I get the error below running the following query:

getpapers -q 'cs=syracuse thsrsterms=audio' -o Output11 --api 'ieee' -n

Error:

info: Searching using ieee API
info: Running in no-execute mode, so nothing will be downloaded
events.js:180
      throw er; // Unhandled 'error' event
      ^

Error: Attribute without value
Line: 372
Column: 114
Char: >
    at error (/usr/lib/node_modules/getpapers/node_modules/xml2js/node_modules/sax/lib/sax.js:651:10)
    at strictFail (/usr/lib/node_modules/getpapers/node_modules/xml2js/node_modules/sax/lib/sax.js:677:7)
    at SAXParser.write (/usr/lib/node_modules/getpapers/node_modules/xml2js/node_modules/sax/lib/sax.js:1321:13)
    at Parser.exports.Parser.Parser.parseString (/usr/lib/node_modules/getpapers/node_modules/xml2js/lib/parser.js:322:31)
    at Parser.parseString (/usr/lib/node_modules/getpapers/node_modules/xml2js/lib/parser.js:5:59)
    at exports.parseString (/usr/lib/node_modules/getpapers/node_modules/xml2js/lib/parser.js:354:19)
    at Request.convertXML2JSON (/usr/lib/node_modules/getpapers/lib/ieee.js:76:5)
    at Request.emit (events.js:208:15)
    at Request.<anonymous> (/usr/lib/node_modules/getpapers/node_modules/requestretry/node_modules/request/request.js:1161:10)
    at Request.emit (events.js:203:13)
Emitted 'error' event at:
    at Parser.exports.Parser.Parser.parseString (/usr/lib/node_modules/getpapers/node_modules/xml2js/lib/parser.js:326:16)
    at Parser.parseString (/usr/lib/node_modules/getpapers/node_modules/xml2js/lib/parser.js:5:59)
    [... lines matching original stack trace ...]
    at Request.emit (events.js:203:13)
    at IncomingMessage.<anonymous> (/usr/lib/node_modules/getpapers/node_modules/requestretry/node_modules/request/request.js:1083:12)
    at Object.onceWrapper (events.js:291:20)
    at IncomingMessage.emit (events.js:208:15)

petermr commented 5 years ago

Thanks Stephan, I have verified that I get the same error (running on Mac OS X). It's Rik Smith-Unna's code and he's better placed to comment.

My immediate suggestion would be to trap the error, log it, and continue. It's possible it's related to a specific article - e.g. one without text or title or something (there's all sorts of rubbish on target sites). Can you run it successfully with other searches? I've tried some minor tweaks without success. Do you know how many hits you get with a manual Web-based search? It's possible that zero hits might trigger this.

Also I don't have an IEEE login and it might be due to that...
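
The "trap the error, log it, and continue" suggestion could look roughly like the sketch below. It is not getpapers code: `FakeXmlParser` and `parseOrSkip` are hypothetical names, and the stand-in parser only mimics a parser that reports failures by emitting an 'error' event, as the stack trace above suggests. In Node, an 'error' event with no listener is thrown, which is what crashes the run; attaching a listener turns the crash into a logged, skippable failure.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for an XML parser that signals failure by emitting
// 'error'. This is NOT the real xml2js API; it exists only so the sketch runs
// without extra dependencies.
class FakeXmlParser extends EventEmitter {
  parseString(text) {
    // Toy check: an attribute with no "=value" part counts as malformed.
    if (/<\w+\s+\w+>/.test(text)) {
      this.emit('error', new Error('Attribute without value'));
      return;
    }
    this.emit('end', { raw: text });
  }
}

// Parse one response body. On failure, log the problem and return null so the
// caller can move on instead of crashing on an unhandled 'error' event.
// Assumption: the parser emits its events synchronously, as the stand-in does.
function parseOrSkip(body, log = console.error) {
  const parser = new FakeXmlParser();
  let result = null;
  parser.once('error', (err) => {
    log(`Skipping unparseable response: ${err.message}`);
  });
  parser.once('end', (parsed) => {
    result = parsed;
  });
  parser.parseString(body);
  return result;
}
```

A caller would check for `null` and carry on with the next response. A parser that emits asynchronously would need a callback or a Promise instead of a plain return value.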

sdruskat commented 5 years ago

Hi @petermr, thanks for the quick answer. I've not been able to run a single query successfully against the IEEE API.

The web interface gives me 6 results iff I supply my API key. I have also tried supplying it as part of the query, like getpapers --api ieee --query 'cs=syracuse apikey=<api-key>' --outdir out -n, but hit the same error...

petermr commented 5 years ago

We wrote it quickly some years ago for a colleague in optical systems. It's very little use to me since I only use Open Access. I don't know how often Rik reads this, but my guess is that IEEE have changed the way the API works and this hasn't been updated. If it's critical, the best bet would be to develop a patch and send Rik a pull request. I'm not aware of other IEEE users. HTH
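
One way to test the guess that the API changed is to look at what the server actually returns before handing it to the XML parser. The sketch below is illustrative only: `whyNotXml` is a hypothetical helper, not part of getpapers, and the idea that the endpoint might now answer with HTML or JSON is an assumption, not something confirmed in this thread.

```javascript
// Returns a short reason when a response is unlikely to be XML, or null when
// nothing looks wrong. Hypothetical helper for diagnosis only.
function whyNotXml(contentType, body) {
  const type = (contentType || '').toLowerCase();
  // Only the first characters matter for a quick sniff of the payload.
  const start = (body || '').trimStart().slice(0, 200).toLowerCase();

  if (type.includes('html') || start.startsWith('<!doctype html') || start.startsWith('<html')) {
    return 'response looks like an HTML page';
  }
  if (type.includes('json') || start.startsWith('{') || start.startsWith('[')) {
    return 'response looks like JSON';
  }
  if (!start.startsWith('<')) {
    return 'response does not start with a tag';
  }
  return null;
}
```

Logging the returned reason next to the HTTP status code would show whether the parse failure comes from a changed response format or from one malformed record.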

sdruskat commented 5 years ago

Thanks, I'll try to look into fixing this, but I've never worked with JS/Node properly, so it may take time.

petermr commented 5 years ago

That's why I can't really help. Node is weakly typed and event-driven, and I'm a strongly typed Java person.
