Closed: anotherdevs closed this issue 3 years ago
Resolved on master. Can you test it on your dataset before releasing it? Thanks.
@halaxa Hot damn, you are fast. I will give it a go. I can probably fully test it over the weekend.
No problem. Just say when you're done.
Does it work for you, @anotherdevs ?
Ping @anotherdevs. I'd like to make a release. Does it work ok?
@halaxa Actually no, because my dataset is so huge, I didn't gain anything from it. It still takes a shit-ton of time to go from one subtree to another subtree. I can't figure out why.
Can you share an anonymized version of your dataset? If there are about ten subtrees of roughly the same size, iterating the first one should be noticeably faster on the master branch than on the last stable version, because the parser should not touch the remaining nine at all. There's a unit test for that. The best thing would be to hunt it down and provide a failing test case.
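To make the expectation above concrete, here is a minimal sketch in Python of a lazy scan that stops as soon as the requested subtree closes. It is not the library's code: `SubtreeItems`, its `pos` attribute, and the `t0`..`t9` keys are hypothetical names made up for illustration, and the document is held in memory as a string rather than streamed from a file. It assumes a top-level object whose values are arrays.

```python
import json


def _skip_ws(text, i):
    """Advance past JSON whitespace and return the new index."""
    while i < len(text) and text[i] in " \t\r\n":
        i += 1
    return i


class SubtreeItems:
    """Lazily iterate the items of one top-level array (hypothetical helper)."""

    def __init__(self, text, key):
        self.text = text
        self.key = key
        self.pos = 0  # how far into the document the scan has read

    def __iter__(self):
        dec = json.JSONDecoder()
        text = self.text
        i = _skip_ws(text, text.index("{") + 1)
        while text[i] != "}":
            name, i = dec.raw_decode(text, i)  # the key string
            i = _skip_ws(text, i)
            assert text[i] == ":"
            i = _skip_ws(text, i + 1)
            if name == self.key:
                assert text[i] == "["  # assumption: the subtree is an array
                i = _skip_ws(text, i + 1)
                while text[i] != "]":
                    item, i = dec.raw_decode(text, i)
                    self.pos = i
                    yield item
                    i = _skip_ws(text, i)
                    if text[i] == ",":
                        i = _skip_ws(text, i + 1)
                self.pos = i + 1
                return  # subtree closed: the rest of the document is never read
            # Not the wanted subtree: this sketch still has to decode it to skip it.
            _, i = dec.raw_decode(text, i)
            i = _skip_ws(text, i)
            if text[i] == ",":
                i = _skip_ws(text, i + 1)
            self.pos = i


# Ten subtrees of the same size; iterate only the first one.
doc = json.dumps({f"t{n}": list(range(100)) for n in range(10)})
first = SubtreeItems(doc, "t0")
items = list(first)
print(len(items), first.pos, len(doc))
```

After the loop, `first.pos` should sit near the end of the first subtree, well short of `len(doc)`. Asking for a later subtree still means scanning past everything in front of it, which is one plausible reason a jump from one subtree to another stays slow on a very large file.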
Also, could you provide the code snippet you use to parse the dataset? Just to make sure it's ok.
Version 0.6.0 has been released. If your problem persists, feel free to post here.
Hi @halaxa
Is there a way to stop iterating once the subtree ends? I have a 500 GB JSON file with 10 subtrees. Right now the code continues iterating and thus wastes a lot of time.
The problem is that, with the current code base, there is no way to know when to break out of the for loop. I would argue that it is most useful for the iteration to stop by itself instead of having to code a break yourself. What do you think?
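The requested behavior can be sketched from the caller's side in Python. This is only an illustration under stated assumptions, not the library's API: `events` is a hypothetical stand-in for a streaming parser that yields `(subtree, item)` pairs in document order, `only_subtree` is a hypothetical wrapper, and each subtree is assumed to be contiguous in the document.

```python
def events(doc, stats):
    """Stand-in for a streaming parser: yields (subtree, item) in document order."""
    for name, subtree_items in doc.items():
        for item in subtree_items:
            stats["read"] += 1  # count how many items the "parser" produced
            yield name, item


def only_subtree(stream, wanted):
    """Yield the items under `wanted`, then end as soon as another subtree starts."""
    seen = False
    for name, item in stream:
        if name == wanted:
            seen = True
            yield item
        elif seen:
            return  # past the wanted subtree: stop without a caller-side break


# Ten small subtrees named t0..t9 (made-up data).
doc = {f"t{n}": [n * 10 + k for k in range(3)] for n in range(10)}
stats = {"read": 0}

result = []
for item in only_subtree(events(doc, stats), "t1"):
    result.append(item)  # no break needed in the loop body

print(result, stats["read"])
```

The wrapper reads one item past the end of the wanted subtree before it can tell the subtree is over. A parser that sees the closing bracket itself can stop exactly there.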
Reference: #21