scholastica / arxiv

Ruby wrapper for the arXiv API
MIT License

Current format with five digits not supported #6

Open demonodojo opened 1 month ago

demonodojo commented 1 month ago

Some arXiv URLs end in a five-digit number, which is not currently supported. Example: 2405.05966
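For reference, here is a minimal sketch of a check that accepts both lengths. arXiv identifiers assigned from 2007-04 through 2014-12 use a four-digit sequence number (YYMM.NNNN), and those assigned from 2015-01 onward use five digits (YYMM.NNNNN). The constant `NEW_STYLE_ID` and the helper `new_style_arxiv_id?` are hypothetical names used only for illustration; they are not taken from this gem's source, and the gem's own validation may differ.

```ruby
# Hypothetical pattern for illustration only; not the gem's actual code.
# Matches new-style arXiv identifiers:
#   YYMM.NNNN  (four-digit sequence number, 2007-04 through 2014-12)
#   YYMM.NNNNN (five-digit sequence number, 2015-01 onward)
# An optional version suffix such as "v2" is also accepted.
NEW_STYLE_ID = /\A(\d{4})\.(\d{4,5})(v\d+)?\z/

# Returns true when the string looks like a new-style arXiv identifier.
def new_style_arxiv_id?(id)
  NEW_STYLE_ID.match?(id)
end

puts new_style_arxiv_id?("2405.05966")    # five-digit sequence number
puts new_style_arxiv_id?("1202.0819")     # four-digit sequence number
puts new_style_arxiv_id?("2405.05966v1")  # with a version suffix
puts new_style_arxiv_id?("2405.059661")   # six digits, rejected
```

This only checks the shape of the identifier. It does not confirm that the paper exists or that the month part is valid.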

tatums commented 1 month ago

Hey @demonodojo

The example you shared appears to be working for me. Perhaps you could provide more info?

Example in text format:

```ruby
❯ irb
irb(main):001:0> require "./lib/arxiv"
=> true
irb(main):002:0> manuscript = Arxiv.get("2405.05966")
/Users/tatum/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/libxml-ruby-2.9.0/lib/libxml/node.rb:75: warning: undefining the allocator of T_DATA class LibXML::XML::XPath::Object
/Users/tatum/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/happymapper-0.5.0/lib/happymapper/item.rb:175: warning: undefining the allocator of T_DATA class LibXML::XML::Attributes
=> #
irb(main):003:0> manuscript.title
=> "Natural Language Processing RELIES on Linguistics"
irb(main):004:0> manuscript.abstract
=> "Large Language Models (LLMs) have become capable of generating highly fluent text in certain languages, without modules specially designed to capture grammar or semantic coherence. What does this mean for the future of linguistic expertise in NLP? We highlight several aspects in which NLP (still) relies on linguistics, or where linguistic thinking can illuminate new directions. We argue our case around the acronym $RELIES$ that encapsulates six major facets where linguistics contributes to NLP: $R$esources, $E$valuation, $L$ow-resource settings, $I$nterpretability, $E$xplanation, and the $S$tudy of language. This list is not exhaustive, nor is linguistics the main point of reference for every effort under these themes; but at a macro level, these facets highlight the enduring importance of studying machine systems vis-a-vis systems of human language."
```