VamsiSangam / theoryofprogramming

Code repository for codes in the educational website Theory of Programming
http://theoryofprogramming.com/
MIT License
103 stars 56 forks source link

Query on Adjacency List in C #50

Closed ghost closed 7 years ago

ghost commented 7 years ago

Hi, I have generated a graph using Adjacency List for larger number of nodes NN =100000(test run for NN=10 nodes) and it works fine. I have merged the same code in likelihood.c(in the repositary Decimation) and checked the linked list and it keeps on repeating the same node instead of linking the next node. No idea where it goes wrong and checked it many times.

VamsiSangam commented 7 years ago

Closing this, as it was solved. :+1:

ghost commented 7 years ago

@VamsiSangam Hi, Reopening this issue again for the same code. I am getting the degree of the chosen node as 1 and have found the neighboring nodes using adjacency list for an undirected graph. But, it is not necessary that a node should be connected to only one neighbor. There may be many edges and in my case why it is 1 always?

VamsiSangam commented 7 years ago

Well, I didn't understand your query exactly, but degree is the property of a vertex only. I don't think there's anything as "degree of an edge".

The degree of a vertex is the number of edges attached to it. In a directed graph, every vertex has an in-degree and out-degree, where in-degree is the number of edges which end at the vertex, and out-degree being the number of edges which start at the vertex.

In an un-directed graph, we only have degree which is simply the number of edges attached to a vertex. No notion of in or out.

ghost commented 7 years ago

That was a mistake in my query. yes, degree is the property of a vertex and have now edited. My question was, I am getting the degree as one for one edge. But, for all the nodes, I am getting the same degree.

VamsiSangam commented 7 years ago

Can you give the link to the code which is giving that output?

ghost commented 7 years ago

Here it is: https://github.com/Vanitha83/Decimation/blob/master/likelihood.c

ghost commented 7 years ago

The equation that I am trying to solve is attached here: https://github.com/Vanitha83/Decimation/blob/master/discussion.pdf

VamsiSangam commented 7 years ago

You are not supposed to write this condition -

if(temp != 0)
    deg++; //count the incident edges on the respective nodes here:

temp already has a meaningful value, otherwise the line above it, printf("-> %d", temp->j);, would have given a segmentation fault. So remove that condition.

temp is not of integer type, so don't treat it as such :stuck_out_tongue_closed_eyes: . It is a pointer, so you need to write something like -

if (temp != NULL) {
    // code
}

Checking temp != 0 is wrong. I also recommend you to change your while loop to this -

while (temp != NULL) {
    // code
}

Please try to follow good coding conventions while writing code. It makes it a lot easier to read and fix bugs in your code. You can refer this for conventions followed in this repository. Hope this solves your issue.

ghost commented 7 years ago

thank you for providing me the link and let me rewrite the code... ..By the way, just went through your webpage and quite interesting.

VamsiSangam commented 7 years ago

Thanks a lot @Vanitha83 :blush: !

ghost commented 7 years ago

Let me keep the issue open at the moment as I am rewriting and will close, once it is done.

ghost commented 7 years ago

@VamsiSangam Logarithmic of negative produces NaN..It seems that I cant continue with this reasearch problem. But is there any safeguard/transformation since I should use the log of the generated random number.(commented in line 216). As you know, I have fixed the seed.

https://github.com/Vanitha83/Decimation/blob/master/relikelihood.c https://github.com/Vanitha83/Decimation/commit/48cd769efe08aef6922beb2ad6b3175e4172618f

Thanks in advance.

VamsiSangam commented 7 years ago

Not very familiar with transformations :sweat_smile: , but you could take the absolute value of your random number.

ghost commented 7 years ago

Fixed x[i] to 1 and got the result. thanks @VamsiSangam for the convention and pointing out the condition on temp which is crucial. I will acknowledge you if this work is included in my paper as this will be the basic result for a deep learning problem that I am about to start. Let me plot the error now for the convergence. Also, "\t" or "\t\t" is not helping me for proper indent in my output datafile that has 5 columns. Other option?

VamsiSangam commented 7 years ago

You can have a look at the string formatting in C. This and this could help you. Specify your printf such that they have properly fixed precision and width. By doing so your 5 columns should get printed in a very legible format.

ghost commented 7 years ago

@VamsiSangam

Convergence is failing. Checking the implementation again and back to the query. Edited after receiving the comment from @VamsiSangam :

for(index = 0; index < deg; index++)
{
     int j = L[i][index]; // neighboring nodes
    sum = sum + s[j]; // choose the state of the node s[j]
}

I need to implement this logic in my code. But it is already implemented as follows, Am I right?:

   for(index = 0; index < deg; index++){
         if(temp != NULL)  // there is a neighboring node
              ss[v] = 1;       // assign the state of the node if it is say infected.
         else
               ss[v] = 0;   // for susceptible.
          }   
ghost commented 7 years ago

Solved the above issue and kindly ignore..

VamsiSangam commented 7 years ago

Cool :stuck_out_tongue_closed_eyes: ... Btw, when you want to write some code in the comments, wrap them with ``` that way they will look better. And you can also do syntax highlighting by mentioning the language name. You can refer to this, so you can make your code look like this -

double sum = 0;
for(index = 0; index < deg; index++)
{
    int j = L[i][index]; // neighboring nodes
    sum = sum + s[j]; // choose the state of the node s[j]
}

Which is more legible. Just an advice. :smile:

ghost commented 7 years ago

Edited now. thanks..Need to learn a lot from CS majors.

ghost commented 7 years ago

@VamsiSangam Segmentation fault, a bad error while running the code for NN = 100000 nodes. Why the same error is not shown for smaller number of nodes. Please see the below:

 for(i = 0, k = 1, t = 0; i < NN-1; i++)
        {  

             struct Graph* graph  = createGraph(i);            
             addEdge(graph, i, k);
             struct AdjListNode* temp = graph->array[i].head;
             printGraph(graph, &sum, temp);
              ...
              ...
        }

I am trying to dereference an uninitialized pointer, but how to fix the pointer as NULL in particular for Graph* here?

Program received signal SIGSEGV, Segmentation fault.
0x00000000004017c8 in main () at hidden2.c:217
217              struct Graph* graph  = createGraph(i);            
(gdb) 
VamsiSangam commented 7 years ago

Maybe you hit the maximum amount of memory allocated for the program. See if you can optimize the memory consumption. And try to decrease NN, let's say 50000 and see if the same error is showing up. Do this repeatedly in a binary search fashion to pin point approximately for the maximum value of NN does this error not show up.

ghost commented 7 years ago

This I do not know how to do: "See if you can optimize the memory consumption". Googling now. I can go only upto NN = 500 nodes for x[i] = 0.01 and x[i] = 0.1. If I increase NN after this limit, then it shows off segfault error. Can see the convergence for the above choice, but it should be achieved for larger number of nodes. Not able to upload the figure files here at the moment due to poor bandwidth.

VamsiSangam commented 7 years ago

Did hidden2.c give correct output for smaller values of NN?

ghost commented 7 years ago

yes, it is. let me try to upload the output. added now(loghood2.txt) in the following column order..

https://github.com/Vanitha83/Decimation/blob/master/loghood2.txt

 L[i][k]      L1[i][k]     x[i]     x1[i]     sum    C
VamsiSangam commented 7 years ago

I took a look at hidden2.c. Your implementation of adjacency list looks a little suspicious. In struct Graph you have a member array defined as such -

struct AdjList* array; 

You meant to use it as an array but it is actually a pointer to a single node, just like struct AdjListNode* next; is.

What I mean to say is, I think you should change your implementation of adjacency list. Please refer to my implmentation if adjacency list. If possible try to alter your implementation of adjacency list to match mine.

And you are actually creating NN^2 edges, so for NN = 500, you are actually creating 124750 edges! Which is a lot of memory in itself. If you want to create a complete graph then NN = 500 is a reasonable upper limit for a C program :stuck_out_tongue_closed_eyes:

Honestly I don't think it is possible to go higher, at least you can't imagine creating a complete graph for NN = 100,000, because that would be 4999950000 edges! And each edge is roughly 8 bytes (pointer is an unsigned int, and another int for data = 4 + 4). Which amounts to 39999600000 bytes or 39999.6 MB or 39.9996 GB (approx) :stuck_out_tongue_closed_eyes:

ghost commented 7 years ago

Perfect. That is a very detailed explanation. I will alter the adj.list that matches your coding at the moment before proceeding the next calculation. Hope this will be done soon and thanks again for your patience for answering all my queries.

ghost commented 7 years ago

I have one more difficulty. How to call the struct pointer temp in the main.? It just prints the address instead of all the lists.

VamsiSangam commented 7 years ago

Umm... Didn't quite understand it :sweat_smile: You use the -> operator to access the value. Can you post the code here so that I can understand it better?

ghost commented 7 years ago

Yes, I tried the -> operator and that didnt work. Here is the code from the main function. fprintf line.

for(i = 0, k = 1, t = 0.0; i < NN-1; i++)
        {  

             addEdge(graph, i, k);
             struct AdjListNode* temp = graph->array[i].head;
             printGraph(graph, &sum, temp);
            // if(ss[i] = 0)
             //{

               //if(ss[i+1] = 0)
                 // ss[k] = 0;
               //else
                 // ss[k] = 1;

             // }
              if(temp != NULL) 
                 ss[k] = 1;
              else
                 ss[k] = 0; 
                 sum1 = sum1 + ss[k];            
              L[i][k] = x1[i]*ss[k]-(pow((1-x[i]),sum1)/(1-pow((1-x[i]),sum1)))*(x1[i]*ss[k]);
              L1[i][k] = L_old[i][k] + R*L[i][k];
              C = sqrt(pow((L1[i][k]-L[i][k]),2));
              double E = log10(C);
              t = t+dt;
              double dt1 = log2(t);
             // double dt1 = log2(hh[fi]);
              fprintf(ed,"%0.6lf\t\t%0.6lf\t%0.6lf\t%0.6lf\t%0.6lf\t%0.6lf\t%0.6lf\t%0.6lf\n", L[i][k], L1[i][k],   x[i],x1[i],sum1, C, E, dt1);
              fprintf(ls,"%p\n", temp->next); 
              k++;

}
ghost commented 7 years ago

solved the above query :). I should have used j instead of next.

ghost commented 7 years ago

@VamsiSangam

Are you familiar with gradient ascent?

VamsiSangam commented 7 years ago

Learnt something about it in Optimization techniques course :thinking: . But I don't remember much. Maths is my weakness :sweat_smile:

VamsiSangam commented 7 years ago

Well I didn't understand your question :sweat_smile: If you bring me a complete well-written program and ask me to fix an error, I could. But I can't help you with your research, much less in Maths :stuck_out_tongue_closed_eyes:

ghost commented 7 years ago

The log-likelihood function is maximized for the step size of 0.01. The infected neighbor is chosen randomly. Comparing this random value with the infection rate which is lambda in the calculation, the state of the node is chosen (infected or susceptible). The log(1-lambda) decreases the error at first and then increases at each time step. ERROR IN THE ADJACENCY LIST.CODE SHOULD BE REWRITTEN. CONVEXITY OF THE OPTIMIZATION SHOULD ALSO BE CHECKED.