fani-lab / SEERa

A framework to predict the future user communities in a text streaming social network based on the users’ topics of interest.

Compute running time for pipelines in Pycharm (pstat) #55

Open Lillliant opened 2 years ago

Lillliant commented 2 years ago

@hosseinfani @soroush-ziaeinejad

Update

PyCharm has a built-in profiler that reports function execution time, call counts, and call graphs, so I've been using it to profile the different pipeline combinations. GitHub doesn't seem to support .pstat files, so I uploaded them to MS Teams for the time being. The files can be viewed in PyCharm via Tools>Open CProfile snapshot. When you hover over a function, you can also see where it is located in the code.
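For anyone without PyCharm, a snapshot can probably also be inspected with Python's standard `pstats` module. This is a minimal sketch, assuming the .pstat files are ordinary cProfile dumps; `run_pipeline`, `profile_to_file`, and `load_stats` are hypothetical names used only for illustration and are not part of SEERa.

```python
import cProfile
import os
import pstats
import tempfile


def run_pipeline():
    # Hypothetical stand-in for one SEERa pipeline run.
    return sum(i * i for i in range(10_000))


def profile_to_file(func, path):
    """Run func under cProfile and dump the raw stats to path."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    profiler.dump_stats(path)
    return result


def load_stats(path):
    """Load a stats dump and sort it by cumulative time."""
    stats = pstats.Stats(path)
    stats.sort_stats("cumulative")
    return stats


if __name__ == "__main__":
    out = os.path.join(tempfile.mkdtemp(), "run.pstat")
    profile_to_file(run_pipeline, out)
    # Print the five entries with the highest cumulative time.
    load_stats(out).print_stats(5)
```

Sorting by cumulative time puts the outermost, most expensive calls first, which is usually the quickest way to see which pipeline stage dominates.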

Current Issue

As of now, I cannot run any pipeline other than the lda.gensim combinations after the latest updates to the code. I initially thought the cause was that I hadn't installed bitermplus, so I tried to resolve the "C++ redistributable version must be 14 and above" issue by downloading the C++ build tools along with the redistributable. That doesn't seem to have solved the problem, so I'm looking through the code to see what else might be causing it.

Planned Next Step

hosseinfani commented 2 years ago

@Lillliant Thanks for the update

@soroush-ziaeinejad Why is the codeline not stable anymore?

soroush-ziaeinejad commented 2 years ago

Hi @hosseinfani and @Lillliant

I checked the latest code on GitHub and it worked for all combinations on the toy.synthetic dataset (except gsdmm, which failed due to a bug that is now resolved; please pull). On bigger datasets, you might run into a MemoryError.

One thing you should consider is parameter tuning. For instance, setting the number of topics to 50 on toy.synthetic may produce very low weights in the user graphs, so all connections are cut off because their weights fall under the threshold. An error may then be raised when the code tries to create a graph without any connections (it says the summation of weights should be non-negative and non-zero).
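The failure mode described above can be sketched roughly as follows. This is not SEERa's code: the function names, the edge weights, and the threshold of 0.3 are all made up for illustration, and the only assumption taken from the thread is that edges under the threshold are dropped and an empty graph then raises an error.

```python
def prune_edges(weights, threshold):
    """Keep only the edges whose weight is at or above the threshold."""
    return {edge: w for edge, w in weights.items() if w >= threshold}


def validate_graph(weights, threshold):
    """Prune the edges and fail early if nothing is left to build a graph from."""
    kept = prune_edges(weights, threshold)
    if sum(kept.values()) <= 0:
        raise ValueError(
            "all edge weights are under the threshold; "
            "lower the threshold or reduce the number of topics"
        )
    return kept


# Hypothetical user-similarity weights, for illustration only.
few_topics = {("u1", "u2"): 0.62, ("u2", "u3"): 0.48}
many_topics = {("u1", "u2"): 0.07, ("u2", "u3"): 0.04}

if __name__ == "__main__":
    print(validate_graph(few_topics, 0.3))
    try:
        validate_graph(many_topics, 0.3)
    except ValueError as err:
        print("error:", err)
```

Checking the pruned weights before building the graph gives a message that points at the parameters rather than at the graph library.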

If you still have problems with running SEERa, please send me the ParamsTemplate.py and a screenshot of the error.

Thanks :)

hosseinfani commented 2 years ago

@Sharjeeliv fyi, see above.

hosseinfani commented 2 years ago

@Lillliant Hi Christine, I expect you to update us weekly or so. Please work with @soroush-ziaeinejad to solve the performance issues. Thanks.

Lillliant commented 2 years ago

Hi @hosseinfani @soroush-ziaeinejad, sorry for not posting updates sooner. I think I'll be able to post more frequently now that most of my personal and school matters have been resolved.

So far:

Lillliant commented 2 years ago

Update

Next Step

Lillliant commented 1 year ago

Update

hosseinfani commented 1 year ago

@Lillliant Thank you for the PR and clean code/documentation :)

soroush-ziaeinejad commented 1 year ago

Hey @Lillliant

Thanks for your PR. I didn't look at the code carefully, but I get this error when I run the code with the argument -p True.

main.py: error: unrecognized arguments: True

Should I change something?

Lillliant commented 1 year ago

Hi @soroush-ziaeinejad, the -p flag on its own is enough to activate the profiler; it does not take a value: python -u main.py -r toy -t [...] -g [...] -p
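The "unrecognized arguments: True" message is what `argparse` prints when a boolean switch is followed by an extra token. Here is a minimal sketch of that behaviour, assuming -p is declared with `action="store_true"`; the `dest` name `profile` and the `build_parser` helper are hypothetical and not taken from SEERa's main.py.

```python
import argparse


def build_parser():
    # Hypothetical parser; only the -p switch mirrors the thread.
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument(
        "-p",
        dest="profile",
        action="store_true",  # a switch: present means True, absent means False
        help="activate the profiler",
    )
    return parser


if __name__ == "__main__":
    parser = build_parser()
    print(parser.parse_args(["-p"]).profile)  # True
    print(parser.parse_args([]).profile)      # False
    # parser.parse_args(["-p", "True"]) would exit with status 2 and the
    # message "main.py: error: unrecognized arguments: True".
```

Because a `store_true` switch consumes no value, the trailing `True` is left over as a stray positional argument, which is why the parser rejects it.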

Lillliant commented 1 year ago

Update

Lillliant commented 1 year ago

Update