Open rezaBarzgar opened 11 months ago
@hosseinfani Dr. Fani,
If you approve, I plan to assign this task to Marco. I will ask him to run it on Compute Canada and share the results with us. I believe it will be a good learning experience for Marco to become familiar with the process of running large-scale projects on cloud resources. I appreciate your thoughts on this.
Thank you
@rezaBarzgar perfect. thank you.
@MarcoKurepa
Comprehensive documentation on using Compute Canada is available in MS Teams -> SCS - HF Research Group / General / files / Library / ComputeCanada Guide.docx. Once you create an account, either Dr. Fani or I will send you an invitation link. You can also update the file with anything from your own experience that you feel should be mentioned.
IMDb (IMPORTANT)
fnn_emb
bnn_emb
loss = 'normal'
loss = 'SL'
loss = 'DP'
fnn_emb
bnn_emb
loss = 'normal'
loss = 'SL'
loss = 'DP'
fnn_emb
bnn_emb
loss = 'normal'
loss = 'SL'
loss = 'DP'
fnn_emb
bnn_emb
loss = 'normal'
loss = 'SL'
loss = 'DP'
These results are for a research paper that we aim to submit to ECIR 2024.
On it!
@hosseinfani Please confirm me on Compute Canada.
@MarcoKurepa already done!
Response from Compute Canada support is already in. Gmail - Error 4 on Cluster Beluga.pdf
@MarcoKurepa I asked Mohammad to tar his file. Hopefully, we can solve the issue soon.
I changed computecanada.sh by removing #SBATCH --gpus-per-node=1, because it was redundant with #SBATCH --gres=gpu:v100:1 and the conflict between the two caused an error.
The job is now pending; I will update you when it has completed.
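For anyone hitting the same error, here is a rough sketch of how the duplicate GPU request could be spotted and removed before submitting. The script text below is a hypothetical stand-in, since the real computecanada.sh is not shown in this thread, and the helper names `gpu_directives` and `drop_directive` are made up for illustration.

```python
import re

# Hypothetical header; the actual contents of computecanada.sh are an assumption.
SCRIPT = """#!/bin/bash
#SBATCH --gpus-per-node=1
#SBATCH --gres=gpu:v100:1
#SBATCH --time=01:00:00
"""

def gpu_directives(script_text):
    """Return every #SBATCH line that requests GPUs via --gpus-per-node or --gres=gpu."""
    pattern = re.compile(r"^#SBATCH\s+--(gpus-per-node|gres=gpu)\b.*$", re.MULTILINE)
    return [m.group(0).strip() for m in pattern.finditer(script_text)]

def drop_directive(script_text, option):
    """Remove the #SBATCH lines whose option starts with `option`."""
    kept = [
        line for line in script_text.splitlines()
        if not re.match(r"#SBATCH\s+" + re.escape(option), line)
    ]
    return "\n".join(kept) + "\n"

found = gpu_directives(SCRIPT)
if len(found) > 1:
    print("More than one GPU request found:", found)

# Keep only the --gres line, as described in the comment above.
fixed = drop_directive(SCRIPT, "--gpus-per-node")
print(fixed)
```

This only checks the two directives mentioned in this thread; other ways of requesting GPUs are not covered.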
@MarcoKurepa I think you need to change the notification email to your own. I've been receiving some notifications from Compute Canada :D
Apologies for that! I'll look into it🙂
Sorry if you got any more notifs today, I just realized they were coming from the slurm script! You shouldn't receive any more emails now:)
@MarcoKurepa no worries. correct, I don't receive them anymore. tnx.
@MarcoKurepa Hey Marco, how are the experiments going? Also, have you started the write-up? I just wanted to gently remind you that these tasks are the priority. Please update me regularly, even if you have not made any progress or are busy with other things.
Hey Reza, sorry for the lack of updates. I have mostly been working on the Compute Canada cluster, although to be frank I've been having a lot of hiccups with it, especially in getting the correct versions of the libraries. At the moment I am still trying to squash errors related to running the script, but I have an .sh ready to run once I've gotten to the point where I can run a sample dataset in the terminal.
I'll make sure to keep you posted moving forward.
While I was working on the CL Visualization issue, I made an error that crashed VSCode, and I lost the progress made on training CL on the GitHub dataset. However, in the ~48 hours that it did run, it only finished 2 epochs (still the first fold), so I think it'd be better to run it on Reza's computer regardless. I also would have needed to restart the run either way to include the expert loss logging script, so no real progress was lost; it was just a mild annoyance in the end.
@rezaBarzgar
This issue is for logging the results of OpeNTF with 2 different loss-based curriculum learning methods (Data Parameters, SuperLoss) on 4 datasets (IMDb, GitHub, DBLP, USPT) by 2 models that have state-of-the-art results (fnn_emb, bnn_emb). In these experiments, hyperparameters should remain the same.
0.001
10
128
True, with patience 5
0.9
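To keep track of what still needs to be run, the full set of experiments can be enumerated as below. This is only a bookkeeping sketch, not OpeNTF's actual command-line interface. The loss labels 'normal', 'SL', and 'DP' are taken from the checklist in this issue, where 'SL' and 'DP' presumably stand for SuperLoss and Data Parameters and 'normal' is the baseline without curriculum learning. The hyperparameter values are left out because their names are not visible here.

```python
from itertools import product

DATASETS = ["IMDb", "GitHub", "DBLP", "USPT"]
MODELS = ["fnn_emb", "bnn_emb"]
# Labels copied from the checklist; their exact meaning is an assumption.
LOSSES = ["normal", "SL", "DP"]

# One entry per (dataset, model, loss) combination.
runs = [
    {"dataset": dataset, "model": model, "loss": loss}
    for dataset, model, loss in product(DATASETS, MODELS, LOSSES)
]

print(f"{len(runs)} runs in total")
for run in runs:
    print(f"{run['dataset']:<7} {run['model']:<8} loss = '{run['loss']}'")
```

That gives 4 x 2 x 3 = 24 runs, 6 per dataset, which matches the per-dataset checklist.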