Random ML question/last figure update

benfig1127 commented 3 years ago

@mikeizbicki Hey Mike, I was wandering around on Reddit this morning and ran into someone doing some cool hyper-parameter tuning with a technique called an evolutionary algorithm. I don't think we ever covered this in 181, but it seems like something that might be useful for me to know. Any thoughts on its usefulness, or advice on looking into it? Thanks!

Also the accuracy table will be fully filled out by early this afternoon :).

mikeizbicki commented 3 years ago

Meh, I'm personally not very impressed with genetic algorithms. They sound cool to laymen because "evolution", but they don't actually perform better than just randomly searching on most distributions. In many ways, they're much worse than random because random has theoretical guarantees and genetic algorithms don't, and genetic algorithms are super complicated to implement whereas random is trivial. Other people certainly disagree with me, but in general, I've found that the more mathematically inclined people are, the more they dislike genetic algorithms.
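To make the "random is trivial" point concrete, here is a minimal sketch of random search over hyper-parameters. It is an illustration only: `random_search`, `toy_objective`, and the `lr`/`reg` ranges are hypothetical names and values that do not come from this thread, and the objective merely stands in for a validation score. One commonly cited guarantee for this kind of search: with n independent draws, the chance that at least one lands in the best 5% of the search space is 1 - 0.95^n, roughly 95% at n = 60.

```python
import random


def random_search(objective, space, n_trials=60, seed=0):
    """Sample each hyper-parameter independently and keep the best trial."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Each entry of `space` is a (low, high) range sampled uniformly.
        params = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(params):
    # Hypothetical stand-in for a validation score; it peaks at lr=0.1, reg=0.5.
    return -((params["lr"] - 0.1) ** 2) - (params["reg"] - 0.5) ** 2


best, score = random_search(toy_objective, {"lr": (0.0, 1.0), "reg": (0.0, 1.0)})
print(best, score)
```

The whole method is the loop above, which is the contrast with a genetic algorithm: there is no population, crossover, or mutation schedule to implement or tune.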

benfig1127 commented 3 years ago

"Meh" LOL, and hey are you calling me a layman? :p Ahh, I see, so I should not waste my time on it, in your opinion. On the other hand, does q-learning (reinforcement learning) fall into that category as well or is that a whole different beast entirely?

mikeizbicki commented 3 years ago

Whole different beast. It's not something I've ever needed, but for the problems it solves, it's the only thing that solves them.

When should I expect the table uploaded?

benfig1127 commented 3 years ago

In 2.15 hours. I'm in a meeting at the moment, but I will be done at 5, and the table should be up immediately after the meeting.

benfig1127 commented 3 years ago

table has been updated!

mikeizbicki commented 3 years ago

You still need to update the values within the text.

Also, I don't think the numbers in the table are correct. The fact that there is a Chinese row/column makes me think you did this on the UN dataset. We need it for the RT dataset, which has German instead of Chinese. Is that what happened?

benfig1127 commented 3 years ago

Yes, I did it on the UN dataset, because a while back you told me that we were including this table for UN, not for RT. I think I have most of the distances already calculated for RT, but I need to go check. Are we absolutely sure this table needs to be for RT instead? I have a note on my desk saying that you wanted it for UN. Redoing it should be fine because RT is much smaller, but I want to make sure before I redo it.

mikeizbicki commented 3 years ago

RT is the one that we care about the most, but we would ideally have both so that we can justify that the results for RT are correct.

benfig1127 commented 3 years ago

Okay, I will log on in 5 minutes and check whether it's still possible to get the table changed to RT in time. I am not sure how many of the RT distances I have; I think quite a lot, but I need to check. I will let you know in 5 minutes.

benfig1127 commented 3 years ago

So I just checked the files, and I have close to all of the RT distances saved, just in a different data format. It will take me a second to write a new function to calculate the accuracies, but I should be able to get it to you within the hour.

benfig1127 commented 3 years ago

I'm having a hard time gauging whether a tau of 0.9 is correct for the RT dataset. Here are the outputs for the language comparisons I have (FYI, right now I'm missing fr-es, ger-es, ab-en, and ger-ru; I need to wait for those to finish calculating):

Name: ab_ab_min_dist_list.combined
Accuracy: 1

Name: ab_es_min_dist_list.combined
Accuracy: 0.3736

Name: ab_fr_min_dist_list.combined
Accuracy: 0.21755

Name: ab_ger_min_dist_list.combined
Accuracy: 0.25435

Name: ab_ru_min_dist_list.combined
Accuracy: 0.4042

Name: en_ab_min_dist_list.combined
Accuracy: 0.33049

Name: en_en_min_dist_list.combined
Accuracy: 1

Name: en_es_min_dist_list.combined
Accuracy: 0.53225

Name: en_fr_min_dist_list.combined
Accuracy: 0.39192

Name: en_ger_min_dist_list.combined
Accuracy: 0.37741

Name: en_ru_min_dist_list.combined
Accuracy: 0.46308

Name: es_ab_min_dist_list.combined
Accuracy: 0.32546

Name: es_en_min_dist_list.combined
Accuracy: 0.58923

Name: es_es_min_dist_list.combined
Accuracy: 1

Name: es_fr_min_dist_list.combined
Accuracy: 0.38544

Name: es_ger_min_dist_list.combined
Accuracy: 0.38526

Name: es_ru_min_dist_list.combined
Accuracy: 0.426

Name: fr_ab_.min_dist_list.combined
Accuracy: 0.26438

Name: fr_en_.min_dist_list.combined
Accuracy: 0.57708

Name: fr_fr_.min_dist_list.combined
Accuracy: 1

Name: fr_ger_.min_dist_list.combined
Accuracy: 0.42805

Name: fr_ru_.min_dist_list.combined
Accuracy: 0.38248

Name: ger_ab_min_dist_list.combined
Accuracy: 0.29154

Name: ger_en_min_dist_list.combined
Accuracy: 0.5107

Name: ger_fr_.min_dist_list.combined
Accuracy: 0.37859

Name: ger_ger_.min_dist_list.combined
Accuracy: 1

Name: ru_ab_min_dist_list.combined
Accuracy: 0.21314

Name: ru_en_min_dist_list.combined
Accuracy: 0.3765

Name: ru_es_min_dist_list.combined
Accuracy: 0.25763

Name: ru_fr_min_dist_list.combined
Accuracy: 0.15708

Name: ru_ger_min_dist_list.combined
Accuracy: 0.17782

Name: ru_ru_min_dist_list.combined
Accuracy: 1

Do these look fine?
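The Name/Accuracy pairs above are easier to check once they are arranged as a language-by-language grid. The following is a minimal sketch of one way to do that, not the function actually used for the paper: `parse_accuracies` and `format_table` are hypothetical helpers, and the sketch assumes only that the first two underscore-separated fields of each file name are the two language codes (the thread does not say which of the two is the source language).

```python
import re


def parse_accuracies(text):
    """Map (first_lang, second_lang) to the accuracy reported for that file."""
    table = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("Accuracy:") and name is not None:
            # Handles both en_es_min_dist_list.combined and
            # fr_en_.min_dist_list.combined style names.
            match = re.match(r"([a-z]+)_([a-z]+)_", name)
            if match:
                table[match.groups()] = float(line.split(":", 1)[1])
            name = None
    return table


def format_table(table, langs):
    """Render the grid as plain text, with '--' for pairs not yet computed."""
    rows = [" " * 6 + " ".join(f"{lang:>7}" for lang in langs)]
    for first in langs:
        cells = []
        for second in langs:
            value = table.get((first, second))
            cells.append(f"{value:7.3f}" if value is not None else f"{'--':>7}")
        rows.append(f"{first:<6}" + " ".join(cells))
    return "\n".join(rows)


sample = """Name: en_es_min_dist_list.combined
Accuracy: 0.53225

Name: fr_en_.min_dist_list.combined
Accuracy: 0.57708
"""
table = parse_accuracies(sample)
print(format_table(table, ["en", "es", "fr"]))
```

Pairs that are still being computed, such as fr-es, show up as `--`, which makes the gaps visible before the numbers go into the tex file.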

mikeizbicki commented 3 years ago

We should expect much smaller numbers. Those look reasonable.

benfig1127 commented 3 years ago

Okay, that's what I thought. I'll input those into the tex file after my meeting finishes at 9pm. I will wait until the other distances are finished later tonight and then input them as soon as they are done.

benfig1127 commented 3 years ago

The table has been updated now; I am adding the other fixme values within an hour or so.

benfig1127 commented 3 years ago

All the fixme values have been updated. I left a few comments, but other than adding the last 4 language pairs to the table, it looks almost completely finished!

mikeizbicki commented 3 years ago

I've pushed an updated version. I'm leaving it all in your hands now. You should add the final numbers into the table and do a proof reading.

Also, don't wait until the last minute to submit. If you have internet problems, or anything goes wrong, and you miss the deadline, then there's nothing we can do.

benfig1127 commented 3 years ago

Will do. I'll probably submit tomorrow afternoon, i.e., the 14th. Thanks for all the last-minute help, Mike!

benfig1127 commented 3 years ago

Do you intend to leave the 6-language tsne plot, or should I swap it back to the Arabic-Russian one? Your caption mentions Russian and Arabic specifically.

benfig1127 commented 3 years ago

Also, I thought you were going to add the $M_{i,i}$ tags to the histogram plot and put the box around it?

mikeizbicki commented 3 years ago

It appears as the Arabic-Russian plot on my computer. That's what it's supposed to be.

Sorry, I don't have time to add the $M_{i,i}$ labels.

benfig1127 commented 3 years ago

No problem, I will make sure it's the correct plot and try and add the histogram key on my own.

benfig1127 commented 3 years ago

This is the correct plot, right? Or is it the other one:

mikeizbicki commented 3 years ago

It's the second one, tsne.png. The problem seems to be that you also had a file named tsne.pdf, which is a different plot. You definitely need to work on your variable/file names.
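A name collision like tsne.png versus tsne.pdf can be caught mechanically before submitting. Below is a minimal sketch, assuming the figures live in a single directory; `find_ambiguous_figures` and the extension list are illustrative choices rather than part of this project, and the temporary directory exists only to demonstrate the check.

```python
import tempfile
from collections import defaultdict
from pathlib import Path

# Assumed set of figure formats; adjust to whatever the paper actually uses.
FIGURE_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg", ".eps"}


def find_ambiguous_figures(directory):
    """Return {stem: [file names]} for figure files that share a base name."""
    by_stem = defaultdict(list)
    for path in Path(directory).iterdir():
        if path.is_file() and path.suffix.lower() in FIGURE_EXTENSIONS:
            by_stem[path.stem].append(path.name)
    return {stem: sorted(names) for stem, names in by_stem.items() if len(names) > 1}


# Demonstration on a throwaway directory that reproduces the collision.
with tempfile.TemporaryDirectory() as figure_dir:
    for file_name in ("tsne.png", "tsne.pdf", "histogram.png"):
        (Path(figure_dir) / file_name).touch()
    ambiguous = find_ambiguous_figures(figure_dir)

print(ambiguous)
```

Giving each plot a distinct, descriptive base name avoids the ambiguity altogether; the check only flags where it already exists.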

benfig1127 commented 3 years ago

Yep, that's what I figured out last night. I'm going to submit it, but I have a quick question about the submission process that I need to show you. Can I come by class like 2 minutes early?

benfig1127 / Summer-Research-2020

Random ML question/last figure update #35