gbowne1 / json-maestro

JSONMaestro is a powerful tool designed for cleaning and processing JSON-like files. It simplifies tasks such as removing comments, eliminating duplicate keys, adding schema keys, and sorting keys. Ideal for developers working with configuration files and API responses, JSONMaestro enhances data integrity and prepares JSON data for further analysis.
MIT License

Removing Comment symbols #32

Open shoshta73 opened 5 hours ago

shoshta73 commented 5 hours ago

The current implementation of the remove_comments function does not take into account comment symbols inside strings, which should not be treated as comments.
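To illustrate the behaviour being asked for, here is a minimal sketch of string-aware comment stripping. It is not the project's remove_comments; the function name `strip_comments_string_aware` is hypothetical, and it assumes only `//` and `/* */` comments and double-quoted strings with backslash escapes.

```python
def strip_comments_string_aware(text: str) -> str:
    """Remove // and /* */ comments, leaving string contents untouched.

    Hypothetical helper for illustration; assumes double-quoted strings only.
    """
    out = []
    i = 0
    n = len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character verbatim so \" does not end the string.
                out.append(text[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Line comment: skip up to the newline, which is kept.
            end = text.find("\n", i)
            i = n if end == -1 else end
        elif text.startswith("/*", i):
            # Block comment: skip past the closing */ (or to EOF if unterminated).
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

With this approach a value such as "http://example.com/*x*/" should survive intact, while a trailing `// note` or `/* gone */` outside any string is dropped.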

shoshta73 commented 5 hours ago

Is it OK if I push changes to relevant test files / test data directly to master?

gbowne1 commented 5 hours ago

Non-breaking changes can be pushed directly to master by maintainers/contributors only, especially if the changes have been tested. I should've put that in the contributing doc. In PRs, I prefer to see working changes and possibly diffs, i.e. before and after the changes, as well as passing tests.

shoshta73 commented 3 hours ago

I have pushed an updated test file to tests/comment-symbols.

shoshta73 commented 3 hours ago

Well this is interesting...

[image]

shoshta73 commented 3 hours ago

I assume a file size over 1 MiB could pose a problem.

gbowne1 commented 3 hours ago

I am assuming that we are shoving stuff into memory for parsing. This will get bad with REALLY large files.

Alrighty, will have a look.

shoshta73 commented 3 hours ago

Large files could also cause object misalignment within the Python interpreter if they are not chunked, which would probably lead to "/*" being treated as a comment.
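If the file were processed in chunks, the stripper would need to carry its state across chunk boundaries, otherwise a string or a `/*` split between two chunks could be misread. The following is a rough sketch of that idea, not the project's code: `strip_comments_chunked` and `read_chunks` are hypothetical names, and it assumes only `//` and `/* */` comments and double-quoted strings.

```python
from typing import Iterable, Iterator


def read_chunks(path: str, size: int = 64 * 1024) -> Iterator[str]:
    """Yield a text file in pieces of at most `size` characters (hypothetical helper)."""
    with open(path, "r", encoding="utf-8") as fh:
        while True:
            chunk = fh.read(size)
            if not chunk:
                break
            yield chunk


def strip_comments_chunked(chunks: Iterable[str]) -> Iterator[str]:
    """Strip // and /* */ comments chunk by chunk, keeping state between chunks."""
    # States: code, string, escape, slash (saw one '/'), line, block, block_star
    state = "code"
    for chunk in chunks:
        out = []
        for ch in chunk:
            if state == "code":
                if ch == '"':
                    state = "string"
                    out.append(ch)
                elif ch == "/":
                    state = "slash"  # hold the slash back until the next char is seen
                else:
                    out.append(ch)
            elif state == "string":
                out.append(ch)
                if ch == "\\":
                    state = "escape"
                elif ch == '"':
                    state = "code"
            elif state == "escape":
                out.append(ch)  # escaped character never ends the string
                state = "string"
            elif state == "slash":
                if ch == "/":
                    state = "line"
                elif ch == "*":
                    state = "block"
                else:
                    out.append("/")  # it was an ordinary slash after all
                    out.append(ch)
                    state = "string" if ch == '"' else "code"
            elif state == "line":
                if ch == "\n":
                    out.append(ch)
                    state = "code"
            elif state == "block":
                if ch == "*":
                    state = "block_star"
            elif state == "block_star":
                if ch == "/":
                    state = "code"
                elif ch != "*":
                    state = "block"
        yield "".join(out)
    if state == "slash":
        yield "/"  # input ended right after a lone slash
```

Because the state lives outside the per-chunk loop, the output should be the same no matter where the chunk boundaries fall, and memory use stays bounded by the chunk size rather than the file size.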

gbowne1 commented 3 hours ago

Yes, that's true, it could. We would need to test this on some really large data.

shoshta73 commented 3 hours ago

So maybe dump like 100 npm dependencies into package.json, download them, and store only the package-lock file to test this?
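A network-free alternative to a real package-lock file would be to generate a synthetic fixture of a chosen size whose string values contain comment symbols. This is only a sketch under assumptions: `make_large_fixture`, the `packages` layout, and the `registry.example` URLs are made up for illustration and do not mirror the actual package-lock format.

```python
import json


def make_large_fixture(target_bytes: int = 1_048_576) -> str:
    """Build a JSON document of at least `target_bytes` bytes (hypothetical helper).

    Every entry holds strings containing '//' and '/*', which a comment
    stripper must leave untouched.
    """
    entries = {}
    size = 0
    i = 0
    while size < target_bytes:
        key = f"pkg-{i}"
        value = {
            "resolved": f"https://registry.example/{key}/-/{key}.tgz",
            "glob": "src/*/**//*.js",
        }
        entries[key] = value
        # Lower bound on the serialized size; indentation only adds to it.
        size += len(key) + len(json.dumps(value))
        i += 1
    return json.dumps({"packages": entries}, indent=2)
```

Since the fixture is valid JSON with no comments at all, running it through the comment remover and comparing the parsed result with the parsed input should reveal whether any string was mangled.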

gbowne1 commented 3 hours ago

Yes, this could go along with #30.

I've seen some gnarly-looking package.json files created by random tools like npm init, npx create-react-app, etc. After a while it gets wild, especially the more you have in the project.