More importantly, refactored the ranking step to use the scipy rank method instead of pandas. scipy calls numpy directly, whereas pandas reaches the same numpy calculation only after doing a lot of index and Series bookkeeping first.
We still get the same answers as before, which is pretty cool and shows that the refactor does not change the results.
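A minimal sketch of that equivalence check, assuming the old code used `pandas.Series.rank` and the new code uses `scipy.stats.rankdata`, both with average ranking for ties. The `values` array is made-up illustration data, not taken from the test datasets, and the comparison assumes no missing values, since the two libraries treat NaN differently.

```python
import numpy as np
import pandas as pd
from scipy.stats import rankdata

# Hypothetical expression values for one cell (ties included on purpose).
values = np.array([0.0, 3.0, 1.0, 3.0, 0.0, 7.0])

# Old path (assumed): rank through a pandas Series.
pandas_ranks = pd.Series(values).rank(method="average").to_numpy()

# New path (assumed): rank the raw array with scipy.
scipy_ranks = rankdata(values, method="average")

# Both should assign tied values the mean of the ranks they span.
ranks_match = np.array_equal(pandas_ranks, scipy_ranks)
print(pandas_ranks, scipy_ranks, ranks_match)
```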
Edited environment.yml and requirements.txt.
If you google a comparison between numpy and pandas you might find this article, which says that numpy generally performs better and that the difference depends on the number of rows: http://gouthamanbalaraman.com/blog/numpy-vs-pandas-comparison.html. For single-row vectors like our data, scipy seems better so far, going by the profiling stats in the notebook. Those were run on a test dataset with 2 human cells and 950 genes, and on the test dataset from conftest.
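For anyone who wants to reproduce the comparison outside the notebook, here is a rough timing sketch. It assumes the two code paths are `pandas.Series.rank` and `scipy.stats.rankdata`. The helper names `rank_with_pandas` and `rank_with_scipy` and the random 950-value vector are hypothetical stand-ins for a single cell, not the actual test data, and the timings will vary by machine.

```python
import timeit

import numpy as np
import pandas as pd
from scipy.stats import rankdata

# Hypothetical single-cell vector with 950 genes (random counts, not real data).
rng = np.random.default_rng(0)
values = rng.poisson(2.0, size=950).astype(float)


def rank_with_pandas(x):
    """Rank by wrapping the array in a Series first (assumed old path)."""
    return pd.Series(x).rank(method="average").to_numpy()


def rank_with_scipy(x):
    """Rank the raw array directly (assumed new path)."""
    return rankdata(x, method="average")


# Time 200 repeated calls of each approach on the same vector.
pandas_seconds = timeit.timeit(lambda: rank_with_pandas(values), number=200)
scipy_seconds = timeit.timeit(lambda: rank_with_scipy(values), number=200)

print(f"pandas: {pandas_seconds:.4f}s for 200 calls")
print(f"scipy:  {scipy_seconds:.4f}s for 200 calls")
```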