erinxocon / pyp

Automatically exported from code.google.com/p/pyp

Slow for big entries (SOLVED) #29

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. cat file | pyp "len(pp)"         # for a file with 500000 numbers

What is the expected output? What do you see instead?
The expected output is 500000; instead, the command is very slow on input this large.

The problem is the method flatten_list(self, iterables):
it is recursive and does too many list manipulations.

I've rewritten it:

    def flatten_list(self, iterables):
        '''
        returns a list of strings from nested lists
        @param iterables: nested list to flatten
        @type iterables: list<str>
        '''
        out = []
        # explicit stack of [list, position, length] frames replaces recursion
        stack = [[iterables, 0, len(iterables)]]
        while stack:
            curIter, pos, limit = stack[-1]
            if pos == limit:
                # current list is exhausted, go back to its parent
                stack.pop()
                continue
            # advance in the current list before possibly descending
            stack[-1][1] += 1
            if type(curIter[pos]) not in [str, PypStr]:
                # nested list: process it next
                stack.append([curIter[pos], 0, len(curIter[pos])])
            else:
                out.append(curIter[pos])

        return out
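
For anyone who wants to try the idea outside of pyp, here is a minimal standalone sketch of the same explicit-stack approach. It is not the pyp source: the PypStr class below is only a stand-in (assumed to be a str subclass), and the function is written as a plain function rather than a method.

```python
class PypStr(str):
    """Stand-in for pyp's PypStr; assumed here to be a str subclass."""


def flatten_list(iterables):
    """Return a flat list of the strings found in arbitrarily nested lists."""
    out = []
    # Each frame is [sequence, index of the next element to visit].
    stack = [[iterables, 0]]
    while stack:
        frame = stack[-1]
        seq, pos = frame
        if pos == len(seq):
            stack.pop()               # this level is exhausted
            continue
        frame[1] += 1                 # advance before possibly descending
        item = seq[pos]
        if isinstance(item, str):     # also matches the PypStr stand-in
            out.append(item)
        else:
            stack.append([item, 0])   # descend into the nested list
    return out


if __name__ == "__main__":
    print(flatten_list(["a", ["b", ["c", PypStr("d")]], [], "e"]))
```

Elements come out in depth-first, left-to-right order, and each one is appended exactly once, so no intermediate lists are rebuilt along the way.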

Today is the first day I've used pyp. It's great, but I ran into this issue
and I think it would be good to make it faster. Could anyone test this method
and maybe patch the source?

Thanks,
C

Original issue reported on code.google.com by deepbit on 16 Sep 2014 at 5:33

GoogleCodeExporter commented 8 years ago
Awesome! Can you post some speed tests? We'll test it and put it in if it works
as expected. We're doing a bunch of stuff next month.

thanks!

Toby

Original comment by tobyro...@gmail.com on 16 Sep 2014 at 6:03

GoogleCodeExporter commented 8 years ago
$ for i in {0..100000}; do echo $i; done > /tmp/numbers
$ time cat /tmp/numbers | pyp "len(pp)"
100001

real    1m27.045s
user    1m26.166s
sys     0m0.606s

(AFTER CODE PATCH)
$ time cat /tmp/numbers | pyp "len(pp)"
100001

real    0m0.927s
user    0m0.871s
sys     0m0.050s
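
A rough way to see the same effect without pyp is to time both strategies in isolation. The sketch below rests on assumptions: flatten_recursive is a hypothetical recursive version that rebuilds the output list at every step (not the original pyp code), and the input shape (one single-element list per line) is a guess at what pyp passes in. Absolute timings will differ from the numbers above.

```python
import time


def flatten_recursive(items):
    """Hypothetical recursive flatten that rebuilds the output list each step."""
    out = []
    for item in items:
        if isinstance(item, str):
            out = out + [item]                    # copies `out` every time
        else:
            out = out + flatten_recursive(item)   # copies `out` every time
    return out


def flatten_iterative(items):
    """Explicit-stack flatten that appends each string exactly once."""
    out = []
    stack = [[items, 0]]              # frames of [sequence, next index]
    while stack:
        frame = stack[-1]
        seq, pos = frame
        if pos == len(seq):
            stack.pop()
            continue
        frame[1] += 1
        item = seq[pos]
        if isinstance(item, str):
            out.append(item)
        else:
            stack.append([item, 0])
    return out


def make_input(n):
    """Assumed input shape: one single-element list per input line."""
    return [[str(i)] for i in range(n)]


def timed(func, data):
    """Return (result, elapsed seconds) for a single call of func(data)."""
    start = time.perf_counter()
    result = func(data)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    data = make_input(10000)
    slow, t_slow = timed(flatten_recursive, data)
    fast, t_fast = timed(flatten_iterative, data)
    assert slow == fast
    print("recursive: %.3fs  iterative: %.3fs" % (t_slow, t_fast))
```

The repeated out = out + ... makes the hypothetical recursive version quadratic in the number of elements, while the explicit-stack version does a constant amount of work per element.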

Best
C

Original comment by deepbit on 16 Sep 2014 at 9:12

GoogleCodeExporter commented 8 years ago
Because I haven't written complex pyp commands, I am not sure whether the patch
is completely bug-free, but I am pretty sure you can test it quickly.

Best
C

Original comment by deepbit on 16 Sep 2014 at 9:15

GoogleCodeExporter commented 8 years ago
Wow, truly impressive... nearly 100 times faster. We'll test this and get back
to you.

thanks again!

t

Original comment by tobyro...@gmail.com on 16 Sep 2014 at 9:23

GoogleCodeExporter commented 8 years ago
Testing this now. Looks OK so far. We're going to release a big update to pyp
that includes this. Do you have any further revisions?

thanks!

toby

Original comment by tobyro...@gmail.com on 7 Feb 2015 at 12:17

GoogleCodeExporter commented 8 years ago
It looks like this works great for lists, with a speedup of about 100x, but
line-by-line speed is still the same. Let me know if you have any ideas.

cheers,

toby

Original comment by tobyro...@gmail.com on 7 Feb 2015 at 1:09

GoogleCodeExporter commented 8 years ago
here is the latest if interested.  pretty close to releasing it.

Original comment by tobyro...@gmail.com on 7 Feb 2015 at 1:45

Attachments: