AMP-SCZ / utility

Storehouse for all utility scripts
Apache License 2.0

image03.py is slow #112

Closed: tashrifbillah closed this issue 1 month ago

tashrifbillah commented 4 months ago

Hi @dheshanm, I realized that appending rows to a DataFrame one at a time is slow. Instead, all rows should first be collected in a plain Python list, and the resulting list should then be converted to a DataFrame in a single step. I have done it here: https://github.com/AMP-SCZ/utility/commit/3ffad0ace559c2dbbf686da70e68ee2b3e288e0f
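
For illustration, here is a minimal sketch of that pattern. It is not the code from the linked commit: the column names and the `build_frame()` helper are hypothetical stand-ins, since the actual image03.py columns are not shown in this thread.

```python
import pandas as pd

# Hypothetical column names, used only for this sketch.
COLUMNS = ["subjectkey", "image_file"]


def build_frame(n):
    """Collect rows in a plain list, then build the DataFrame once."""
    rows = []
    for i in range(n):
        # list.append is cheap; nothing is copied here.
        rows.append([f"SUB{i:03d}", f"scan_{i}.nii.gz"])

    # A single DataFrame construction at the end, instead of growing
    # a DataFrame row by row (which copies the existing data each time).
    return pd.DataFrame(rows, columns=COLUMNS)


df = build_frame(5)
```

Growing a DataFrame inside the loop re-copies the accumulated data on every iteration, so the total cost rises roughly quadratically with the number of rows, whereas the list-based version stays linear.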

Currently, @kcho's big NDA manifest takes more than 3 hours to process. This is the line that needs to be modified as described above:

https://github.com/AMP-SCZ/utility/blob/3ffad0ace559c2dbbf686da70e68ee2b3e288e0f/nda-transform/image03.py#L111

tashrifbillah commented 4 months ago

A rows=[] list should be defined in the main() function underneath. The populate() function will then append to that list. Thus, in the current design, passing arguments between functions will still not be required.
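
A rough sketch of that layout, assuming `rows` is exposed as a module-level name so that populate() can reach it without an extra parameter. The populate() signature, the column names, and the loop in main() are hypothetical and only serve to make the example runnable.

```python
import pandas as pd


def populate(subject_id, image_file):
    # 'rows' is looked up as a global at call time, so it does not
    # have to be passed in as an argument.
    rows.append([subject_id, image_file])


def main():
    # Define the shared list here; 'global' makes it visible to populate().
    global rows
    rows = []

    # Hypothetical stand-in for looping over the manifest entries.
    for i in range(3):
        populate(f"SUB{i:03d}", f"scan_{i}.nii.gz")

    # Convert the accumulated rows to a DataFrame once, at the end.
    return pd.DataFrame(rows, columns=["subjectkey", "image_file"])


result = main()
```

Relying on a shared global keeps the existing function signatures unchanged, at the price of hidden state; passing `rows` explicitly would be the alternative if the functions are ever reused elsewhere.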

tashrifbillah commented 1 month ago

Fixed by https://github.com/AMP-SCZ/utility/commit/251034ee1233cc32b27098ea37c9c127338b4248