Closed by vijaynaidu 7 years ago
This is not how it works. This is open source. In general, when you ask for help, I would expect that you show me the code that you've written (your attempt) and tell me what you expect. I'd then point out where we have different assumptions.
I currently don't have the time to help you with this. Maybe try Stack Overflow? Also: The examples directory here has quite a bit of parslet code for you to peruse. Good luck!
@kschiess Sorry, I apologise for my mistake :( :+1: Thanks for your reply. Sure, I will try to get help from other sources.
I hope someone finds this attempt helpful. I'm using Parslet to parse text, and it works well for segmenting page numbers, i.e. CASE 2. But I have no idea how to do the same for CASE 1, i.e. parsing the title.
require 'parslet'
require 'pp'

page = 'pp. S170–S177.'

class PageParse < Parslet::Parser
  root(:page_exp)

  # Basic tokens; each `?` rule is the optional form of the rule above it.
  rule(:space) { match('\s').repeat(1) }
  rule(:space?) { space.maybe }
  rule(:dot) { str('.').repeat(1) }
  rule(:dot?) { dot.maybe }
  rule(:comma) { str(',').repeat(1) }
  rule(:comma?) { comma.maybe }
  rule(:alphabet) { match('[A-Za-z]').repeat(1) }
  rule(:alphabet?) { alphabet.maybe }
  rule(:integer) { match('[0-9]').repeat(1) }
  rule(:integer?) { integer.maybe }
  rule(:alpha_numeric) { (alphabet | integer).repeat(1) }
  rule(:alpha_numeric?) { alpha_numeric.maybe }

  # Leading label such as "page", "pp." or "p."; longer alternatives come first.
  rule(:page_label_names){ str('page') | str('pp') | str('p') }
  rule(:page_label_names?){ page_label_names.maybe }
  rule(:page_label){ space? >> page_label_names? >> dot? >> space? }

  # Trailing punctuation and whitespace after the page range.
  rule(:page_end_boundary){ space? >> comma? >> dot? >> space? }
  rule(:page_end_boundary?){ page_end_boundary.maybe }

  # Page numbers may mix letters and digits, e.g. "S170".
  rule(:page_no){ alpha_numeric }
  rule(:page_no?){ page_no.maybe }

  # Hyphen or en dash between the first and last page.
  rule(:page_seperator){ str('-').repeat(1) | str('–').repeat(1) }
  rule(:page_seperator?){ page_seperator.maybe }

  rule(:page_content){ page_no?.as(:first_page) >> page_seperator?.as(:separator) >> page_no?.as(:last_page) }
  rule(:page_exp){ page_label.maybe.as(:match_pre) >> page_content.maybe.as(:pages) >> page_end_boundary.as(:match_post) }
end

def parse(page)
  PageParse.new.parse(page)
rescue Parslet::ParseFailed => failure
  #return page
  # Print the failure tree; the method then returns nil.
  puts failure.parse_failure_cause.ascii_tree
end

pp parse(page)
Hi,
I've taken a quick look after all. I have a hard time understanding the syntax that underlies this 'title' thing. Apparently, it nests '"' without escaping, so a parser would have to keep reading balanced '"' until it finds the last one in the document? Maybe your difficulty in parsing this comes from the underlying grammar being underdefined.
Maybe that helps? kaspar
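To illustrate the point about unescaped nested quotes, here is a minimal sketch in plain Ruby (not Parslet) of the "read until the last quote" approach. It assumes the title is everything between the first and the last '"' in the text. The helper name `extract_title` and the sample string are hypothetical; the thread's actual CASE 1 input is not shown here.

```ruby
# Hypothetical helper: returns the text between the first and the last
# double quote, so quotes nested inside the title are kept as-is.
# Assumption: nothing after the title contains a double quote.
def extract_title(text)
  first = text.index('"')
  last  = text.rindex('"')
  # Two distinct quote characters are needed to delimit a title.
  return nil if first.nil? || first == last
  text[(first + 1)...last]
end

# Made-up sample input, used only for illustration.
sample = 'Smith, J. "A study of "quoted" words in titles". Journal, 2010.'
puts extract_title(sample)
# prints: A study of "quoted" words in titles
```

This only works under the stated assumption. If later fields can also contain '"', the input is ambiguous, and the grammar needs an escaping rule or a different delimiter before any parser can handle it reliably.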
Hi @kschiess, thanks for the cool plugin. I'm trying to understand how to actually use Parslet for my case. Can you help me with the idea/syntax for the following cases?
CASE 1: Input:
Expected output:
CASE 2:
Expected output:
Thanks