mguinness closed this issue 5 years ago
Sure ... if you make it optional
I just released a new package https://www.nuget.org/packages/IFilterTextReader/1.6.1
New package works great, thanks!
@Sicos1977 and @mguinness the only problem with this is that metadata properties can be duplicated, e.g.
Names: foo
Names: bar
In this scenario, the dictionary throws a "key already exists" exception. I'll log a separate issue for this as well.
Out of interest, what is the filtdump output for an example file? I imagine the tags are coming from different sections in the file. Changing the field type to List<KeyValuePair<string, object>> would work.
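To make the suggestion above concrete, here is a minimal sketch of why a dictionary fails on repeated property names while a list of pairs does not. The class and method names are hypothetical and are not part of IFilterTextReader; the values mirror the "Names: foo / Names: bar" example.

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateMetadataDemo
{
    // Hypothetical helper: copies pairs into a dictionary with Add,
    // which throws ArgumentException when a key is already present.
    public static Dictionary<string, object> ToDictionary(
        IEnumerable<KeyValuePair<string, object>> pairs)
    {
        var result = new Dictionary<string, object>();
        foreach (var pair in pairs)
            result.Add(pair.Key, pair.Value);
        return result;
    }

    public static void Main()
    {
        // Two values reported under the same property name.
        var extracted = new List<KeyValuePair<string, object>>
        {
            new KeyValuePair<string, object>("Names", "foo"),
            new KeyValuePair<string, object>("Names", "bar"),
        };

        try
        {
            ToDictionary(extracted);
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine("Dictionary rejected the duplicate key: " + ex.Message);
        }

        // The list keeps both entries, in the order they were read.
        foreach (var pair in extracted)
            Console.WriteLine(pair.Key + " = " + pair.Value);
    }
}
```

The trade-off is that a list gives up lookup by key, so callers that want all values for one name would need to filter the list themselves.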
@mguinness - sorry I didn't get back to this sooner. In this case it's the same section, but the 'different sections' case is also a problem.
CHUNK: ---------------------------------------------------------------
    Attribute = {2C443B1E-F1E2-404F-974D-E21FEF8E70AA}\Names
    idChunk = 13
    BreakType = 2 (Sentence)
    Flags (chunkstate) = (Value)
    Locale = 2057 (0x809)
    IdChunkSource = 13
    cwcStartSource = 0
    cwcLenSource = 0
VALUE: ---------------------------------------------------------------
    Type = 31 (0x1f), VT_LPWSTR
    Value = "Test A"
CHUNK: ---------------------------------------------------------------
    Attribute = {2C443B1E-F1E2-404F-974D-E21FEF8E70AA}\Names
    idChunk = 14
    BreakType = 2 (Sentence)
    Flags (chunkstate) = (Value)
    Locale = 2057 (0x809)
    IdChunkSource = 14
    cwcStartSource = 0
    cwcLenSource = 0
VALUE: ---------------------------------------------------------------
    Type = 31 (0x1f), VT_LPWSTR
    Value = "Test B"
<rdf:Description rdf:about=""
xmlns:TestSchema="http://test">
<TestSchema:Names>
<rdf:Bag>
<rdf:li>Test A</rdf:li>
<rdf:li>Test B</rdf:li>
</rdf:Bag>
</TestSchema:Names>
</rdf:Description>
Now, whilst we changed to <string, object> (and I'm going to look at this again soon), I seem to recall thinking that including the schema in the output would be useful. I'm pretty sure I found that the string key becomes just 'Names'. So if one purpose is to let an application filter on a specific property, the metadata property output doesn't let you tell apart the same name coming from different paths when there is a conflict. So for example, I have
Where we have System.Title, title and Title.
One of them is dc:title, the other is TestSchema:Title, and presumably System.Title is the default document title outside the metadata. I think this is the issue you were getting at?
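One way to picture the schema-qualified output discussed above is to key each value by both the property set and the name, and to collect repeated values into a list. This is only a sketch under assumptions: the types and names below are hypothetical and not the library's API, the first GUID is the one shown in the filtdump output and is assumed to identify the property set, and the second GUID is a made-up placeholder for a different schema.

```csharp
using System;
using System.Collections.Generic;

public static class QualifiedMetadataDemo
{
    // Groups values under a (property set, name) key so that equal names
    // from different schemas stay separate and repeated values accumulate.
    public static Dictionary<(Guid PropertySet, string Name), List<object>> Group(
        IEnumerable<(Guid PropertySet, string Name, object Value)> chunks)
    {
        var result = new Dictionary<(Guid PropertySet, string Name), List<object>>();
        foreach (var chunk in chunks)
        {
            var key = (chunk.PropertySet, chunk.Name);
            if (!result.TryGetValue(key, out var values))
            {
                values = new List<object>();
                result[key] = values;
            }
            values.Add(chunk.Value);
        }
        return result;
    }

    public static void Main()
    {
        // GUID taken from the filtdump output in this thread.
        var testSchema = Guid.Parse("2C443B1E-F1E2-404F-974D-E21FEF8E70AA");
        // Hypothetical placeholder standing in for a second schema.
        var otherSchema = Guid.Parse("00000000-0000-0000-0000-000000000001");

        var grouped = Group(new (Guid, string, object)[]
        {
            (testSchema, "Names", "Test A"),
            (testSchema, "Names", "Test B"),
            (testSchema, "Title", "Title from TestSchema"),
            (otherSchema, "Title", "Title from another schema"),
        });

        foreach (var entry in grouped)
        {
            var joined = string.Join(", ", entry.Value);
            Console.WriteLine(entry.Key.PropertySet + "\\" + entry.Key.Name + " = " + joined);
        }
    }
}
```

With a key like this, an application could filter on one schema's Title without picking up the other, and the two Names values end up in a single list rather than colliding.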
Thanks for the reply. The example you cited seems more like an array of names. Can you upload a small example document?
@mguinness - it was indeed an array of names - sample image uploaded below (hopefully GitHub doesn't modify it):
When the includeProperties option is set to true the metadata is included in the output. Would it be possible to expose a new property on the FilterReader class as a dictionary? I can put a PR together if you have no objections.