OpenState-SDN / spider

OpenState-based fault resilient SDN pipeline design with programmable failure detection and recovery
http://arxiv.org/abs/1511.05490
Apache License 2.0

dpctl: error during transaction. #2

Closed AbdullahAlshraa closed 7 years ago

AbdullahAlshraa commented 7 years ago

Dear all,

I am working in the same area (fault management), building on SPIDER. I use the same method to send ping packets from source to destination, but sometimes the forwarding tables crash and the error message "dpctl: error during transaction" appears.

For instance, when I use the Norway topology to send a ping from switch 1 to switch 22 through switch 20 (1 -> 20 -> 22), I can see the packet arrive at host 22; however, the reply does not reach the source switch.

dptclerror

I checked the reverse path, from switch 22 to switch 1 through switch 21, and found that its table had been destroyed, so the packet did not get through. The second picture was taken before the ping packet reached host 22:

capture

The third was taken after the ping packet reached host 22. The first picture shows that the switch replies, but the ping packet was dropped because of the crashed forwarding table.

screen

I have read something about this error here https://github.com/CPqD/ofsoftswitch13/issues/15 , but I did not understand it, and I do not know whether it is the same case.

I look forward to hearing from you. Thanks in advance.

AbdullahAlshraa commented 7 years ago

On almost all paths, there is no problem when sending ping packets under normal conditions. capture

However, I modified the set-link-down button so that it toggles the link status between up and down every 10 seconds, as follows:

import os
import time

def set_link_down(self, node1, node2, faults, fault, fault_ID):
    # Normalise the pair so that node1 is always the lower-numbered switch.
    if node1 > node2:
        node1, node2 = node2, node1

    # Interface names at the two ends of the link (sX-ethY).
    if1 = 's' + str(node1) + '-eth' + str(self.ports_dict['s' + str(node1)]['s' + str(node2)])
    if2 = 's' + str(node2) + '-eth' + str(self.ports_dict['s' + str(node2)]['s' + str(node1)])

    rounds = int(input('Please enter the number of rounds: '))
    for i in range(rounds):
        # Take the link down for 10 seconds, then bring it back up for 10 seconds.
        for state in ('down', 'up'):
            os.system('sudo ifconfig ' + if1 + ' ' + state)
            os.system('sudo ifconfig ' + if2 + ' ' + state)
            print('link between %s and %s is %s' % (node1, node2, state))
            print(time.strftime("%I:%M:%S"))
            time.sleep(10)
    print('your experiment finished')
    print(time.strftime("%I:%M:%S"))
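
To make the port lookup in the function above easier to check without touching real interfaces, here is a minimal sketch that only builds the two ifconfig command strings for a link. The helper name link_commands and the example port map are hypothetical; the layout of ports_dict (switch name -> neighbour name -> port number) is an assumption inferred from how the function above indexes it.

```python
def link_commands(ports_dict, node1, node2, state):
    """Build the two ifconfig commands that set both ends of a link to `state`.

    ports_dict is assumed to map a switch name to {neighbour name: port number}.
    """
    if state not in ("up", "down"):
        raise ValueError("state must be 'up' or 'down'")
    a = "s" + str(node1)
    b = "s" + str(node2)
    return [
        "sudo ifconfig %s-eth%s %s" % (a, ports_dict[a][b], state),
        "sudo ifconfig %s-eth%s %s" % (b, ports_dict[b][a], state),
    ]


# Hypothetical port map for a single link between s1 and s20.
example_ports = {"s1": {"s20": 2}, "s20": {"s1": 1}}
```

Printing link_commands(example_ports, 1, 20, "down") before running the experiment shows which interfaces would be toggled, which can help rule out a wrong port mapping as the cause of a broken path.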

At the beginning, there were no problems in either the normal or the failure situation, until around 100 ping packets had been sent. capture1

Unfortunately, the crashed forwarding table problem then appeared again.

Any advice will be appreciated.

DavideSanvito commented 7 years ago

Hi Abdullah, have you tried to check the log of the crashed switch after the crash? It should be located in the /tmp/ folder and, if ofdatapath has been run with the max level of verbosity, you should get a hint about the possible problem. Can you check it? Best, Davide

AbdullahAlshraa commented 7 years ago

Hello, I am not sure I understood that correctly. I took the following screenshot after the switch crashed. Could you tell me which folder you meant?

capture

However, I described two different situations. In the first one, host 1 sent one ping packet to host 22 under normal conditions (one packet is enough to make switch 22 crash).

In the second one, the crashed switch sent the same packet through all its ports repeatedly (I think this is the situation you meant), so perhaps it was running with the max level of verbosity.

DavideSanvito commented 7 years ago

Hi, try to look into the /tmp folder located in the root folder / of the file system. Best, Davide
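
As a rough way to pull the end of those logs without opening each file by hand, something like the following could be used. This is only a sketch: the helper name tail_switch_logs is made up, and it assumes the per-switch logs are named like s22-ofd.log and s22-ofp.log, which are the file names that show up later in this thread.

```python
import glob
import os


def tail_switch_logs(switch, log_dir="/tmp", lines=20):
    """Return the last `lines` lines of each log file found for `switch`.

    Assumes files are named <switch>-ofd.log and <switch>-ofp.log.
    The result maps each file name to a list of its final lines.
    """
    result = {}
    pattern = os.path.join(log_dir, switch + "-of?.log")
    for path in sorted(glob.glob(pattern)):
        with open(path) as handle:
            result[os.path.basename(path)] = handle.read().splitlines()[-lines:]
    return result
```

Calling tail_switch_logs("s22") right after the crash returns the final lines of both logs for that switch, which is usually where the last message before the failure ends up.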

AbdullahAlshraa commented 7 years ago

Thank you for your patience. Regarding the first situation (1 to 22):

I found the /tmp folder in the root. capture

I used "gedit" to open s22-ofd.log

capture1

and I also used "gedit" to open s22-ofp.log

capture2

You can find the whole file (s22-ofp.log) here: txt.txt

Regarding the second situation:

I found the /tmp folder in the root. capture3

I used "gedit" to open s24-ofd.log
capture4

You can find the whole file (s24-ofd.log) here: 123.txt

I also used "gedit" to open s24-ofp.log

capture5

You can find the whole file (s24-ofp.log) here: 1234.txt

Thanks in advance, Abdullah