morganwildermuth / LPCodingChallenge


duplicate sections / keys #6

Open vinbarnes opened 10 years ago

vinbarnes commented 10 years ago

I'm opening this issue primarily for discussion purposes.

What approach would you take to guard against duplicate sections and duplicate section keys?

Per the requirements,

> Key names within each section must be unique (but key names may be reused in different sections). Each section name must be unique within the file.
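For concreteness, a hypothetical config file that violates both rules might look like this (section and key names are made up):

```ini
[database]
host = localhost
host = 127.0.0.1   ; duplicate key within [database] -- invalid

[server]
port = 8080

[database]         ; duplicate section name -- invalid
user = admin
```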

morganwildermuth commented 10 years ago

Oh! When I read that requirement I assumed the source data was clean and that only additions had to be guarded, so new entries could only introduce unique section names and unique key names within a section. If I had to confirm the source data itself was clean, I'd add a guard to the if statement in file_to_hash so that current_section is only set when line.is_new_section? returns true; that method would check the hash built so far to make sure the section doesn't already exist. On a duplicate, I'd set current_section to a sentinel value like "invalid", and while current_section is "invalid" only the line.is_section? check would run, so the duplicate section's keys never get added to the file_hash I'm building. I could also potentially merge the duplicate section's keys (assuming they're unique) into the original section, but I think that causes more problems than it solves.
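A minimal Ruby sketch of that guard. The actual file_to_hash isn't shown in this thread, so the loop structure and helper names (is_section?, section_name) are assumptions:

```ruby
def file_to_hash(lines)
  file_hash = {}
  current_section = nil

  lines.each do |line|
    if line.is_section?
      name = line.section_name
      if file_hash.key?(name)
        # Duplicate section: poison the state so following keys
        # aren't merged into the earlier section's hash.
        current_section = "invalid"
      else
        file_hash[name] = {}
        current_section = name
      end
    elsif current_section == "invalid"
      next # skip everything until the next section header
    # ... key/value and wrapped-line handling would continue here ...
    end
  end

  file_hash
end
```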

As for duplicate keys within a section, I'd do the same thing in the branch of file_to_hash that checks line.is_key_value_pair?: before setting the value, I'd first check the file_hash to make sure the key is unique, and drop it if it isn't. Again, I'd then set current_key to "invalid" in case the next line is a wrapped line that depends on that value to know where to go. Then inside set_wrapped_line there'd be a check to ensure current_key isn't "invalid" before attempting set_wrapped_value.
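Continuing the loop from the sketch above, the duplicate-key branch could look like this. Same caveat applies: line.key, line.value, and is_wrapped_line? are assumed accessors, not confirmed parser API:

```ruby
if line.is_key_value_pair?
  if file_hash[current_section].key?(line.key)
    # Duplicate key: drop it, and poison current_key so a following
    # wrapped line isn't appended to the stale value.
    current_key = "invalid"
  else
    file_hash[current_section][line.key] = line.value
    current_key = line.key
  end
elsif line.is_wrapped_line?
  # The check described above, hoisted out of set_wrapped_line:
  set_wrapped_value(line) unless current_key == "invalid"
end
```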

To be thorough, it'd also probably be a good idea to build a hash called duplicate_data with a "section" key whose value is an array of all the duplicated section names, and a "key" key whose value is an array of all the duplicated keys along with their sections. After the parser runs, it could return that data so the user knows what happened and isn't confused about missing pieces.
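One possible shape for that report (the entries here are made-up examples):

```ruby
duplicate_data = {
  "section" => ["owner"],               # section names seen more than once
  "key"     => [["database", "port"]]   # [section, key] pairs that repeated
}

# The parser could then return both, e.g.:
#   return file_hash, duplicate_data
```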

Haha, sorry about closing the previous issues; I should have realized they were discussion-oriented rather than simple question/answer style. I'd love feedback on the above! I'm sure there are some errors of logic; I find it hard to catch all the potential missteps in changing code without actually writing it and running the tests.

vinbarnes commented 10 years ago

I think the easiest approach would be to abort the file parsing on the first invalid entry (whether a duplicate section or a duplicate key). You could also combine that with your duplicate-entry list to provide useful user feedback. Good idea.
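A fail-fast sketch of that abort-on-first-duplicate idea; the error class and helper are illustrative, not actual project code:

```ruby
class DuplicateEntryError < StandardError; end

def check_unique!(hash, name, kind)
  raise DuplicateEntryError, "duplicate #{kind}: #{name}" if hash.key?(name)
end

# Inside the parsing loop:
#   check_unique!(file_hash, section_name, "section")
#   check_unique!(file_hash[current_section], key, "key")
```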

> Haha, sorry about closing the previous issues

No problemo. It is a bit easier to chat in an Issue or PR rather than via commits, but this seems to be working.

morganwildermuth commented 10 years ago

Awesome.

Thanks for being so thorough with this; I'm loving your feedback!