Open alanz opened 10 years ago
I guess that's what PRs are for and they could get accepted or not or could lead to some discussion. Could I take a look at them already now?
It's a work in progress, currently in my private project. Will generate a PR when it stabilises.
But perhaps a small gist showing their API? In any case, thanks.
parseSheet :: Worksheet -> [DetailLine2]
parseSheet sh = catMaybes $ map (parseRow sh) [3 .. 3]

parseRow :: Worksheet -> RowNum -> Maybe DetailLine2
parseRow sh row = r
    `debug` ("parseRow:cells= " ++ show cells)
  where
    cells = map (\col -> cellsh sh (row,col)) [1..11]
    r = case parseDl cells of
      Left err -> Nothing `debug` ("parseRow " ++ show row ++ ":" ++ show err)
      Right dl -> Just dl

-- parse :: Stream s Identity t => Parsec s () a -> SourceName -> s -> Either ParseError a
parseDl :: [Maybe CellValue] -> Either ParseError DetailLine2
parseDl ss = parse p "source" ss

type P a = Parsec [Maybe CellValue] () a

p :: P DetailLine2
p = pDLHeading

pDLHeading :: P DetailLine2
pDLHeading = do
  many1 pEmpty
  name <- pText
  many1 pEmpty
  pLabel "Date: Statement For: "
  pEmpty
  date <- pNumber
  return (DLHeading name date)

-- | Return the text from a cell
pText :: P T.Text
pText = tokenPrim show nextPos getMaybeText
  where
    nextPos pos _ _ = incSourceColumn pos 1
    getMaybeText mt = case mt of
      Just (CellText str) -> Just (T.fromStrict str)
      _ -> Nothing

-- | Return the value of a cell
pNumber :: P Rational
pNumber = tokenPrim show nextPos getMaybeNumber
  where
    nextPos pos _ _ = incSourceColumn pos 1
    getMaybeNumber mt = case mt of
      Just (CellDouble d) -> Just (double2Rational d)
      _ -> Nothing

-- | Parse an empty cell
pEmpty :: P ()
pEmpty = tokenPrim show nextPos getMaybeCell
  where
    nextPos pos _ _ = incSourceColumn pos 1
    getMaybeCell mt = case mt of
      Just _ -> Nothing
      _ -> Just ()

-- | Match a cell with a specific label
pLabel :: T.Text -> P ()
pLabel label = tokenPrim show nextPos matchText
  where
    nextPos pos _ _ = incSourceColumn pos 1
    matchText mt = case mt of
      Just (CellText str) -> if label == T.fromStrict str
                               then Just ()
                               else Nothing
      _ -> Nothing
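The four cell primitives in the snippet above differ only in their matching function, so the shared `tokenPrim show nextPos` call could be factored out. Here is a minimal, self-contained sketch of that idea. The `CellValue` type is a two-constructor stand-in rather than the real xlsx type, `String` and `Double` replace `T.Text` and `Rational` to avoid extra dependencies, and the names `cellToken`, `pHeading` and `parseHeading` are hypothetical.

```haskell
import Text.Parsec (Parsec, ParseError, many1, parse, tokenPrim)
import Text.Parsec.Pos (incSourceColumn)

-- Stand-in for the xlsx CellValue type (assumption: only two constructors).
data CellValue = CellText String | CellDouble Double
  deriving (Eq, Show)

-- A row is a list of optional cells; Nothing marks an empty cell.
type P a = Parsec [Maybe CellValue] () a

-- | Hypothetical helper: consume one cell if the matcher accepts it.
-- Each consumed cell advances the reported column by one.
cellToken :: (Maybe CellValue -> Maybe a) -> P a
cellToken = tokenPrim show nextPos
  where
    nextPos pos _ _ = incSourceColumn pos 1

-- | Return the text from a cell.
pText :: P String
pText = cellToken $ \c -> case c of
  Just (CellText s) -> Just s
  _                 -> Nothing

-- | Return the numeric value of a cell.
pNumber :: P Double
pNumber = cellToken $ \c -> case c of
  Just (CellDouble d) -> Just d
  _                   -> Nothing

-- | Accept only an empty cell.
pEmpty :: P ()
pEmpty = cellToken (maybe (Just ()) (const Nothing))

-- | Match a text cell holding exactly the given label.
pLabel :: String -> P ()
pLabel label = cellToken $ \c ->
  if c == Just (CellText label) then Just () else Nothing

-- | Same shape as pDLHeading above, returning a plain pair.
pHeading :: P (String, Double)
pHeading = do
  _ <- many1 pEmpty
  name <- pText
  _ <- many1 pEmpty
  pLabel "Date: Statement For: "
  pEmpty
  date <- pNumber
  return (name, date)

-- | Run the heading parser over one row of cells.
parseHeading :: [Maybe CellValue] -> Either ParseError (String, Double)
parseHeading = parse pHeading "row"
```

Because `tokenPrim` fails without consuming input when the matcher returns `Nothing`, `many1 pEmpty` stops cleanly at the first non-empty cell.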
GitHub is a bit strange; I guess it did not use markdown for your email. I think your message should look like this:
parseSheet :: Worksheet -> [DetailLine2]
parseSheet sh = catMaybes $ map (parseRow sh) [3 .. 3]

parseRow :: Worksheet -> RowNum -> Maybe DetailLine2
parseRow sh row = r
    `debug` ("parseRow:cells= " ++ show cells)
  where
    cells = map (\col -> cellsh sh (row,col)) [1..11]
    r = case parseDl cells of
      Left err -> Nothing `debug` ("parseRow " ++ show row ++ ":" ++ show err)
      Right dl -> Just dl

-- parse :: Stream s Identity t => Parsec s () a -> SourceName -> s -> Either ParseError a
parseDl :: [Maybe CellValue] -> Either ParseError DetailLine2
parseDl ss = parse p "source" ss

type P a = Parsec [Maybe CellValue] () a

p :: P DetailLine2
p = pDLHeading

pDLHeading :: P DetailLine2
pDLHeading = do
  many1 pEmpty
  name <- pText
  many1 pEmpty
  pLabel "Date: Statement For: "
  pEmpty
  date <- pNumber
  return (DLHeading name date)

-- | Return the text from a cell
pText :: P T.Text
pText = tokenPrim show nextPos getMaybeText
  where
    nextPos pos _ _ = incSourceColumn pos 1
    getMaybeText mt = case mt of
      Just (CellText str) -> Just (T.fromStrict str)
      _ -> Nothing

-- | Return the value of a cell
pNumber :: P Rational
pNumber = tokenPrim show nextPos getMaybeNumber
  where
    nextPos pos _ _ = incSourceColumn pos 1
    getMaybeNumber mt = case mt of
      Just (CellDouble d) -> Just (double2Rational d)
      _ -> Nothing

-- | Parse an empty cell
pEmpty :: P ()
pEmpty = tokenPrim show nextPos getMaybeCell
  where
    nextPos pos _ _ = incSourceColumn pos 1
    getMaybeCell mt = case mt of
      Just _ -> Nothing
      _ -> Just ()

-- | Match a cell with a specific label
pLabel :: T.Text -> P ()
pLabel label = tokenPrim show nextPos matchText
  where
    nextPos pos _ _ = incSourceColumn pos 1
    matchText mt = case mt of
      Just (CellText str) -> if label == T.fromStrict str
                               then Just ()
                               else Nothing
      _ -> Nothing
yep
Sorry to revive this already quite old issue, but what is the preferred way of parsing spreadsheets? Obviously, the traditional stream-based parsers are a bit limited, since xlsx already provides us with a nice CellMap we can traverse at will. If for some reason we were to shoe-horn a Worksheet into a stream-based parser, then the first problem is to define a suitable stream type. My first attempt is something like
data XlsToken = EndOfRow | C Cell
fromSheet :: CellMap -> [XlsToken]
instance Stream [XlsToken] where
(see also this comment)
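To make the token idea concrete, here is a rough sketch of what `fromSheet` might look like. It assumes the cell map is keyed by `(row, column)` pairs; `Cell` and `CellMap` are simplified stand-ins defined locally, not the real xlsx types, and the grouping strategy is just one possible choice.

```haskell
import Data.Function (on)
import Data.List (groupBy)
import qualified Data.Map as M

-- Stand-ins for the xlsx types (assumptions for illustration only).
newtype Cell = Cell String
  deriving (Eq, Show)

type CellMap = M.Map (Int, Int) Cell -- keyed by (row, column)

data XlsToken = EndOfRow | C Cell
  deriving (Eq, Show)

-- | Flatten a cell map into a token list, row by row.
-- M.toAscList orders keys by row first and column second, so grouping
-- on the row index yields one group per populated row.
fromSheet :: CellMap -> [XlsToken]
fromSheet cm = concatMap rowTokens rows
  where
    rows = groupBy ((==) `on` (fst . fst)) (M.toAscList cm)
    rowTokens cells = map (C . snd) cells ++ [EndOfRow]
```

For a map holding cells "a" at (1,1), "b" at (1,3) and "c" at (2,2), this yields `[C (Cell "a"), C (Cell "b"), EndOfRow, C (Cell "c"), EndOfRow]`. Note that this sketch drops empty cells and empty rows, so column positions are lost; a real token type would probably need to carry coordinates or emit explicit gap tokens.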
Forgoing the stream approach, we could use ReaderT CellMap (Either ParseError) as a parser monad, which allows us to jump freely across the sheet as we see fit. The only compelling reason to use the stream-based approach is that we can build on libraries with excellent error reporting, like Megaparsec.
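As a rough illustration of the reader-based alternative, the sketch below looks cells up by coordinate instead of consuming them in order. Everything here is an assumption for illustration: `CellValue` and `CellMap` are local stand-ins, `SheetError` replaces `ParseError` with a hypothetical hand-rolled error type, and the column positions in `heading` are made up.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Reader (ReaderT, asks, runReaderT)
import qualified Data.Map as M

-- Stand-ins for the xlsx types (assumptions for illustration only).
data CellValue = CellText String | CellDouble Double
  deriving (Eq, Show)

type CellMap = M.Map (Int, Int) CellValue -- keyed by (row, column)

-- Hypothetical error type: where the lookup failed and why.
data SheetError = SheetError (Int, Int) String
  deriving (Eq, Show)

type SheetParser = ReaderT CellMap (Either SheetError)

-- | Look up a cell at an arbitrary position; Nothing means empty.
cellAt :: (Int, Int) -> SheetParser (Maybe CellValue)
cellAt pos = asks (M.lookup pos)

-- | Abort the whole parse with an error tied to a position.
failAt :: (Int, Int) -> String -> SheetParser a
failAt pos msg = lift (Left (SheetError pos msg))

-- | Require a text cell at the given position.
textAt :: (Int, Int) -> SheetParser String
textAt pos = do
  c <- cellAt pos
  case c of
    Just (CellText s) -> return s
    _                 -> failAt pos "expected text"

-- | Require a numeric cell at the given position.
numberAt :: (Int, Int) -> SheetParser Double
numberAt pos = do
  c <- cellAt pos
  case c of
    Just (CellDouble d) -> return d
    _                   -> failAt pos "expected number"

-- | Read a heading from fixed columns of a row (columns are made up).
heading :: Int -> SheetParser (String, Double)
heading row = (,) <$> textAt (row, 2) <*> numberAt (row, 6)

-- | Run a sheet parser against a cell map.
runSheet :: SheetParser a -> CellMap -> Either SheetError a
runSheet = runReaderT
```

The trade-off is visible in `SheetError`: positions and messages have to be threaded by hand, whereas a stream-based library would track them for us.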
If stream-based spreadsheet parsing turns out to be of general interest, I could release an xlsx-megaparsec library. I think this has no place in the xlsx library itself.
I am writing some Parsec combinators for my own use on top of this.
Do you want a pull request for them when I am done? I am not sure if they belong in this library.