fani-lab / SEERa

A framework to predict the future user communities in a text streaming social network based on the users’ topics of interest.

Compute the carbon footprint of our pipeline for each baseline #46

Closed hosseinfani closed 2 years ago

hosseinfani commented 2 years ago

@Sharjeeliv Once you're done installing SEERa, please start this task, which follows up on our brief talk about the carbon footprint of models. We want to know the carbon footprint of each baseline (each combination of a topic modeling method and a graph embedding method).

Here is a helpful link for finding a library that calculates the carbon footprint of a model: https://twitter.com/ZetaVector/status/1547916747507871744?s=20&t=MZWEJDzLVUSx0hOKpQC5-Q

Sharjeeliv commented 2 years ago

For this task should I submit the code or the results? In the meantime, I've installed the package and I'm working on implementing it.

Also, the project is still not available to me on SharePoint. I've contacted Dr. Brunet, and he has granted me an extension on the contract for the time being.

hosseinfani commented 2 years ago

@Sharjeeliv You can send a pull request (pr) when you're sure that your code is ready to be merged into the codebase.

I have already submitted the project, and we're waiting for Dr. Kobti to confirm it. You can follow up with Dr. Kobti as well.

Sharjeeliv commented 2 years ago

Update: Reviewing git and GitHub

Sharjeeliv commented 2 years ago

Update

To-Do

Notes

(1) https://learntocodetogether.com/create-your-first-pull-request/

hosseinfani commented 2 years ago

@Sharjeeliv Thank you. I had a look at the log file. It shows that you successfully ran the pipeline. Awesome. However, I was not able to see the carbon footprint result. Looking forward to your pr. Thanks.

Sharjeeliv commented 2 years ago

Hi Sir, I've been sick this week, so I haven't been able to wrap up and send the pr. I will do this as soon as I feel I can (hopefully in 1-2 days).

hosseinfani commented 2 years ago

@Sharjeeliv Sorry to hear that. Wishing you a fast and full recovery. Take your time and rest.

Sharjeeliv commented 2 years ago

Update

Sharjeeliv commented 2 years ago

Update

hosseinfani commented 2 years ago

@Sharjeeliv You only need to test your code on a running combination. Leave the bugs in other combinations to @soroush-ziaeinejad

Sharjeeliv commented 2 years ago

I've opened a pr for this feature. The pipeline is not working for me, but since the feature has been done for some time and successfully tested on earlier versions, it seemed best to open the pr for review.

hosseinfani commented 2 years ago

Hi @Sharjeeliv, thank you. There are some issues with your pr:

1) Please only include the files or changes that are necessary. I reviewed your pr and it seems only requirement.txt and TopicModeling.py are necessary, am I right?

2) The carbon tracer only targets the topic modeling layer. However, we need to trace the whole pipeline, so you should put it in main.py here:

https://github.com/fani-lab/SEERa/blob/4e032be147483a756872927fb89de66bc7a9b274/src/main.py#L162

Let me know if you need more info on this.
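
To illustrate point 2, here is a minimal sketch of wrapping every layer in one tracking window rather than tracking a single layer. The layer functions and `StubTracker` are hypothetical stand-ins introduced only for this sketch: the stub reports elapsed seconds, not emissions, so the code runs without extra packages. The assumption is that the real tracker exposes the same `start()`/`stop()` pair as the `EmissionsTracker` used later in this thread.

```python
import time
from typing import Callable, List


class StubTracker:
    """Hypothetical stand-in with a start()/stop() pair.

    It measures elapsed seconds, not emissions.
    """

    def start(self) -> None:
        self._t0 = time.perf_counter()

    def stop(self) -> float:
        return time.perf_counter() - self._t0


def run_traced(layers: List[Callable[[], None]], tracker) -> float:
    """Run all layers inside a single start()/stop() window."""
    tracker.start()
    try:
        for layer in layers:
            layer()
    finally:
        # stop() runs even if a layer raises, so the window is always closed.
        reading = tracker.stop()
    return reading


executed: List[str] = []


# Hypothetical layer names; the real pipeline's layers live in the repository.
def topic_modeling() -> None:
    executed.append("topic_modeling")


def graph_embedding() -> None:
    executed.append("graph_embedding")


def community_prediction() -> None:
    executed.append("community_prediction")


reading = run_traced(
    [topic_modeling, graph_embedding, community_prediction], StubTracker()
)
print(f"whole-pipeline reading: {reading:.6f} (stub units: seconds)")
```

Placing the window around the full list of layers yields one reading for the entire run; tracking only `topic_modeling` would leave the other layers unaccounted for.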

Sharjeeliv commented 2 years ago

Update

import importlib
import traceback

from codecarbon import EmissionsTracker

# Params, cmn, and main() are the ones already used in src/main.py.


def run(tml_baselines, gel_baselines, run_desc):
    for t in tml_baselines:
        for g in gel_baselines:
            # Create a fresh tracker on each iteration to get the emissions of each combination.
            tracker = EmissionsTracker()
            tracker.start()
            try:
                cmn.logger.info(f'Running pipeline for {t} and {g} ....')
                baseline = f'{run_desc}/{t}.{g}'
                # Fill in the parameter template for this combination and reload it.
                with open('ParamsTemplate.py') as f:
                    params_str = f.read()
                new_params_str = params_str.replace('@baseline', baseline).replace('@tml_method', t).replace(
                    '@gel_method', g)
                with open('Params.py', 'w') as f:
                    f.write(new_params_str)
                importlib.reload(Params)
                main()
            except Exception:
                # Log the failure and continue with the next combination.
                cmn.logger.info(traceback.format_exc())
            finally:
                # stop() runs even when the pipeline fails, so every combination gets a reading.
                emissions: float = tracker.stop()
                cmn.logger.info(f'Pipeline Emissions for {t} - {g}: {emissions}')
                cmn.logger.info('\n\n\n')

hosseinfani commented 2 years ago

@Sharjeeliv Any update? Can we have the carbon log for all combinations on the toy sets?
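
One way such a per-combination log could be gathered is sketched below. Everything named here is hypothetical (`collect_readings`, `DummyTracker`, the baseline names, and the simulated failure); the dummy tracker returns a fixed placeholder value, not a real measurement, and stands in for a tracker with the `start()`/`stop()` interface shown in the `run` function above.

```python
import itertools
import logging
from typing import Callable, Dict, Iterable, Tuple

logger = logging.getLogger("carbon_sketch")


class DummyTracker:
    """Hypothetical tracker; stop() returns a fixed placeholder, not a measurement."""

    def start(self) -> None:
        pass

    def stop(self) -> float:
        return 1.0


def collect_readings(
    tml_baselines: Iterable[str],
    gel_baselines: Iterable[str],
    run_one: Callable[[str, str], None],
    tracker_factory: Callable[[], object],
) -> Dict[Tuple[str, str], float]:
    """Return one tracker reading per (topic modeling, graph embedding) pair."""
    readings: Dict[Tuple[str, str], float] = {}
    for t, g in itertools.product(tml_baselines, gel_baselines):
        tracker = tracker_factory()  # fresh tracker per combination
        tracker.start()
        try:
            run_one(t, g)
        except Exception:
            # A failing combination is logged and does not stop the others.
            logger.exception("Pipeline failed for %s.%s", t, g)
        finally:
            readings[(t, g)] = tracker.stop()
    return readings


def fake_pipeline(t: str, g: str) -> None:
    # Simulate one broken combination to show that the loop keeps going.
    if (t, g) == ("tml_b", "gel_x"):
        raise RuntimeError("simulated failure")


readings = collect_readings(
    ["tml_a", "tml_b"], ["gel_x", "gel_y"], fake_pipeline, DummyTracker
)
for (t, g), value in readings.items():
    print(f"{t}.{g}: {value}")
```

Keeping the readings in a dictionary keyed by combination makes it easy to write them to the log or a table afterwards, including the entries for combinations that failed.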

Sharjeeliv commented 2 years ago

Yep, this was tested and works for all Gensim combinations. It does not work with Mallet, which has issues running the bat file on macOS (I have not found a solution yet). Soroush recommended that I make it work for Gensim, and he'll test it with Mallet.

hosseinfani commented 2 years ago

@Sharjeeliv Thanks. @soroush-ziaeinejad Would you please verify this task and close this issue? Then assign a new task to Sharjeel. Thanks.

soroush-ziaeinejad commented 2 years ago

@Sharjeeliv Thank you for the pr. @hosseinfani Sure, I'll do it tomorrow.

soroush-ziaeinejad commented 2 years ago

@hosseinfani, Done.

Thanks, @Sharjeeliv