uwhpsc-2016 / syllabus

Spring 2016 Course Syllabus and Information

Quiz #5 question #33

jhyearsley opened this issue 8 years ago

jhyearsley commented 8 years ago

Edit: I didn't write this! Haha, not sure how it got here ;), but I suppose this is a good summary of my original question (except that my question was actually about the alternate way to compute the number of threads being executed in parallel).

I am wondering why A is the only correct answer on question 3. It was demonstrated in class that there is a possibility, and maybe even a probability, of overwriting the value of 'id' since all threads have access to it, but I don't really see why that is necessarily "wrong".

It compiles, runs, and in this case will very likely report the actual current thread it is on, right? I know that this introduces a very dangerous possibility for a bug, but I am not really sure why it is necessarily wrong. Also, it isn't exactly clear that we even care about it reporting the current thread. Maybe we are just curious about which thread last wrote to 'id'. Clearly you meant for it to mean 'print the current thread that is executing' or something like that, but with chunking, maybe all we care about is when the last thread completed its task, and it will be more current this way. (A stretch?)

jhyearsley commented 8 years ago

I asked a question about the quiz without thinking that not everyone has taken it yet. Will re-ask tomorrow.

cswiercz commented 8 years ago

Oops. No worries. Thank you for deleting the comment. (Though I think anyone who subscribes to this repo receives automatic email notifications when someone posts.)

cswiercz commented 8 years ago

For reference, the question was:

If anything, what is wrong with the following code?

int id;
#pragma omp parallel num_threads(4)
{
  id = omp_get_thread_num();
  printf("The current thread id is: ");
  printf("%d \n", id);
}

The correct answer was: "The variable "id" needs to be declared "private" in the omp parallel directive." 88/100 students answered this question correctly. 11/100 answered "The code is valid. There is actually nothing wrong with it."

cswiercz commented 8 years ago

The key problem with this code is that id is, by default, a shared variable, meaning that every spawned thread reads the value of id from the exact same location in memory. This has the side effect that all of the threads can also write to that same location in memory.

This leads to the following possible race condition:

  1. Thread 0: write 0 to id
  2. Thread 1: write 1 to id
  3. Thread 0: read id (which is now equal to 1) and print
  4. Thread 1: read id (which is still equal to 1) and print

By declaring id as private, each thread receives its own copy to work with, which completely avoids any possibility of a race condition.
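
For reference, here is a minimal sketch of the fix named in the correct answer (the includes and main wrapper are my additions; the quiz showed only the fragment):

#include <omp.h>
#include <stdio.h>

int main()
{
  int id;
  // private(id) gives every thread its own copy of id, so no thread
  // can overwrite the value another thread is about to print
  #pragma omp parallel num_threads(4) private(id)
  {
    id = omp_get_thread_num();
    printf("The current thread id is: ");
    printf("%d \n", id);
  }
  return 0;
}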

cswiercz commented 8 years ago

By the way, the same principle applies to the question concerning the code

int nthreads = 0;

// the following parallel block is executed once by each thread
#pragma omp parallel num_threads(8)
{
  nthreads += 1;
}

By default, nthreads is shared. A similar race condition can potentially occur.
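
Going beyond the quiz for a moment: if one actually wanted this counting pattern to be safe, a minimal sketch is to declare the update atomic:

#include <omp.h>
#include <stdio.h>

int main()
{
  int nthreads = 0;

  // the parallel block is executed once by each thread; the atomic
  // directive makes each read-increment-write of nthreads indivisible
  #pragma omp parallel num_threads(8)
  {
    #pragma omp atomic
    nthreads += 1;
  }
  printf("%d\n", nthreads);  // reliably prints 8
  return 0;
}

A reduction(+:nthreads) clause on the parallel directive would work just as well, and omp_get_num_threads() is, of course, the direct way to ask for the count.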

jhyearsley commented 8 years ago

Yeah, my question was actually in regard to the 8-thread nthreads quiz question. I wrote up the code, and even after adding a for loop to "take up some time" (100000000 iterations), I got the answer 8 in several experiments. Though I understand that there is a theoretical race condition occurring, I'm wondering when negligible probability actually overrides the necessity of worrying about a race condition. If we are supposed to be thinking in terms of probability, I think the pedantic mathematician would argue that there is never full certainty you will get the answer 8 even if the block is not executed in parallel (e.g. there is a small probability that while typing this message my hands fall through the computer). Maybe I'm being a little extreme, but I think it's an interesting point to consider.

yusufmansour commented 8 years ago

In regard to the nthreads question, wouldn't adding a big time sink reduce the chances of a race condition? My thought is that you create a larger time variance between when the threads are created and when they execute nthreads += 1. Whereas if you do not have a time sink, all threads are created and attempt to execute nthreads += 1 more closely together in time. Out of curiosity, I tried the code below: the output of nthreads was 8 between 97% and 99% of the time; the other times it was typically 7.

#include <omp.h>
#include <stdio.h>

int main()
{
  int ctr = 0;
  for (int i = 0; i < 1000; ++i) {
    int nthreads = 0;

    // the following parallel block is executed once by each thread
    #pragma omp parallel num_threads(8)
    {
      nthreads += 1;
    }
    // count the trials in which no increment was lost
    if (nthreads == 8) {
      ctr += 1;
    }
  }
  printf("%d/1000\n", ctr);
  return 0;
}

cswiercz commented 8 years ago

Yeah, my question was actually in regard to the 8-thread nthreads quiz question. I wrote up the code, and even after adding a for loop to "take up some time" (100000000 iterations), I got the answer 8 in several experiments. Though I understand that there is a theoretical race condition occurring, I'm wondering when negligible probability actually overrides the necessity of worrying about a race condition.

I think this is a poor argument. In this class we are working with toy problems. But what if your parallel code takes weeks to execute? Do you want to roll the dice on having to wait a month instead? I can always counter with the "nuclear reactor" argument: do you really want to gamble on 99% stability with an objective that really needs 100% stability?

If we are supposed to be thinking in terms of probability, I think the pedantic mathematician would argue that there is never full certainty you will get the answer 8 even if the block is not executed in parallel (e.g. there is a small probability that while typing this message my hands fall through the computer). Maybe I'm being a little extreme, but I think it's an interesting point to consider.

Attributing computer behavior to human mistakes is a very extreme way to look at this problem. Computers are deterministic. You can't blame the computer for your incorrectly written code.

The best way to see what happens in this simple example is to generate the corresponding assembly code and examine what happens at the register level. Perhaps you are correct and on modern architectures there is a special CPU instruction for accumulation by another value that does not follow the usual read-accumulate-write process. However, I doubt this is true of every piece of hardware, especially older processors.

jhyearsley commented 8 years ago

The best way to see what happens in this simple example is to generate the corresponding assembly code and examine what happens at the register level. Perhaps you are correct and on modern architectures there is a special CPU instruction for accumulation by another value that does not follow the usual read-accumulate-write process. However, I doubt this is true of every piece of hardware, especially older processors.

That was really the answer I was looking for. How does one go about generating the corresponding assembly code for future reference?

All my other ranting was done purely as a thought experiment (it's fun to wonder!). Though I am in agreement that the computer is deterministic, I still think it's worth knowing the what/where/whys of uncertainty within the machine itself (which, admittedly, I do not know, and this is where my curiosity comes in!). Determinism is a funny concept which tends to break down on different scales. Since the inner workings of computer hardware are still essentially black magic to me, I will continue to maintain my skepticism and silly thought experiments, which I'm fully aware are not grounded in formality :)

cswiercz commented 8 years ago

That was really the answer I was looking for. How does one go about generating the corresponding assembly code for future reference?

Take a look at

$ gcc -S my_code.c

this will output a file called my_code.s which contains the corresponding assembly. Just to make things more frustrating, the map from assembly to machine code is mostly one-to-one, but not entirely. And to make things even more frustrating, assembly is architecture-dependent. Most processors nowadays are based on the x86-64 architecture, but some aren't. Furthermore, with certain optimizations enabled it can be hard to tell how your C code was transformed into assembly.
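
As a concrete example (the file names here are hypothetical), you can watch the read-accumulate-write pattern appear by compiling a plain increment. Note that with -fopenmp the parallel block is outlined into a helper function, so the real assembly for our quiz code is messier:

$ cat incr.c
int x;
void bump(void) { x += 1; }

$ gcc -S incr.c
$ cat incr.s

On x86-64 without optimization, the core of bump looks roughly like

  movl  x(%rip), %eax    # read x from memory into a register
  addl  $1, %eax         # accumulate in the register
  movl  %eax, x(%rip)    # write the result back to memory

Another thread can run between the read and the write-back, and that window is exactly the race.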

All my other ranting was done purely as a thought experiment (it's fun to wonder!). Though I am in agreement that the computer is deterministic, I still think it's worth knowing the what/where/whys of uncertainty within the machine itself (which, admittedly, I do not know, and this is where my curiosity comes in!). Determinism is a funny concept which tends to break down on different scales. Since the inner workings of computer hardware are still essentially black magic to me, I will continue to maintain my skepticism and silly thought experiments, which I'm fully aware are not grounded in formality :)

Of course! My apologies if I came off as harsh. There are a lot of student questions to answer and not enough time to answer them all. All instructors crave good questions such as these, especially after last-minute homework questions. :)

The deeper one dives into the inner workings of a computer and the software-to-hardware map, the more... well... complicated things get. High-level languages like Python do a lot to hide what's going on in the background. That's one of the reasons why it is so quick and easy to get something up and running in Python! The lower you go, the more you have to think about what you're writing (imagine that!), and one's "code lines per second" drops significantly.

Skepticism is an essential tool in writing error-free code. Keep it around.

As I mentioned at the beginning of this course, we will only be scratching the surface. I would love to spend an entire year diving into the deeper workings of the hardware as well as the software. (We're only using a small corner of the C programming language in this class.) Whether or not enough students, such as yourself, are interested in such a course sequence is something the Applied Mathematics department should evaluate.

jhyearsley commented 8 years ago

Cool, I'll take a look at that! And I'm in full agreement; I think it would be awesome to have more time to dig deeper under the hood and explore in more depth the connections between hardware and software. The Amath department certainly has my vote!