Ability to tail a file continuously even if the file is recreated on the target #31

fieldrndservices / libssh2-labview

A LabVIEW library for SSH client support via libssh2

Apache License 2.0

Closed temin4u closed 3 years ago

temin4u commented 3 years ago

At the outset I would like to thank the team for such a wonderful toolkit. Thanks for sharing with the community.

I'm able to tail the file continuously by writing the tail command to the channel at a 1 second interval after going through the example; thanks to Chris for pointing me here: https://github.com/fieldrndservices/libssh2-labview/blob/main/src/Examples/Read-Execute-Print-Loop%20with%20a%20Raspberry%20Pi.vi

I'm facing an issue when the file is recreated on the server side. The read operation reads only the tail command that was written to the channel and stops reading the file data.

Creating a new session and channel after closing the current ones does work. So it looks like the reference to the original file is lost when it is momentarily unavailable, since it is renamed and backed up on the target side.

I wanted to know if there is a better way to get around this. If the library has a specific flag for this, it would help me make an informed decision rather than developing logic to detect this scenario. Feedback on this would be greatly appreciated.

[Screenshot: Test continuous tail]

Thanks, Temin

volks73 commented 3 years ago

> At the outset I would like to thank the team for such a wonderful toolkit. Thanks for sharing with the community.

You're welcome! Thank you for the kind words.

> I'm able to tail the file continuously by writing the tail command to the channel at a 1 second interval after going through the example; thanks to Chris for pointing me here: https://github.com/fieldrndservices/libssh2-labview/blob/main/src/Examples/Read-Execute-Print-Loop%20with%20a%20Raspberry%20Pi.vi

The "Read-Execute-Print-Loop with a Raspberry Pi" example has been heavily modified and updated since I provided a VI snippet for periodically reading the output of a tail command on a remote SSH server. Please check out the latest revision of the example on the main branch of this repository. It would also be good to review Issue #30, since troubleshooting that bug led to the revisions to the example.

You already found one bug with appending the EOL character to the command; however, I would recommend using only the Linefeed (\n) character instead of the EOL character. I think you are connecting to a remote SSH server on a Linux/UNIX machine, like a CompactRIO or Raspberry Pi, while the LabVIEW code is running on a Windows computer. The EOL character from the String palette in LabVIEW automatically determines the EOL (End-of-Line) sequence based on the host machine's operating system. On Windows, the EOL sequence is \r\n (Carriage Return and Linefeed/Newline), while on Linux/UNIX/macOS it is just \n (Linefeed/Newline). You may be experiencing some issues with sending commands to a Linux/UNIX/macOS machine if the Carriage Return is included.
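To make the line-ending point concrete, here is a minimal Python sketch. The toolkit itself is LabVIEW, so this only illustrates the string handling; `with_linefeed` is a hypothetical helper and not part of libssh2-labview. It strips whatever line ending the host appended and terminates the command with a single linefeed before it would be written to the channel:

```python
def with_linefeed(command: str) -> str:
    """Terminate a shell command with exactly one linefeed (LF)."""
    # Drop any trailing CR/LF the host platform may have appended,
    # then add the single "\n" a Linux/UNIX shell expects.
    return command.rstrip("\r\n") + "\n"


# A Windows-style EOL ("\r\n") is normalized to a bare "\n".
windows_style = with_linefeed("tail file.txt\r\n")
unterminated = with_linefeed("tail file.txt")
```

Both calls should produce the same string, so the command looks identical to the remote shell regardless of the platform the LabVIEW code runs on.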

A small side note, I am not sure the "Move Position" to end is needed. If you are just appending data/text to a file, then appending with the "Write" function in LabVIEW is the default behavior and moving the file position is excessive.

I think a problem might be with the -f flag for the tail command. This appears to keep the tail command running and continuously displaying the content of file.txt to STDOUT, i.e. the terminal. However, every second (1000 ms) you are sending the tail -f command. If you run for 10 seconds, then you will have 10 instances of the tail -f command running, all writing the content to STDOUT (Standard Output), which is sent to the Windows 10 host over SSH and read in the Read loop. You should either drop the -f flag or remove periodically sending the tail command and just send the tail -f command once at the start of the VI or when a button is pressed on the front panel.

I think I might know the problem, but I need to know the order of events (question marks used to indicate steps that are not clear or unknown):

1. Remote SSH server machine started up
2. "file.txt" created?
3. LabVIEW VI snippet started
4. tail file.txt command sent
5. Read output for tail command for some time
6. "file.txt" closed?
7. "file.txt" renamed to "old_file.txt"?
8. New file created? (file.txt, file2.txt, etc.?)
9. Error with LabVIEW code

Is the file always named "file.txt"?

If you are using the tail -f command, then when file.txt is moved/renamed/rolled over, the tail -f command will result in unusual behavior because it is no longer "tailing" the original file. The LabVIEW code should be unaffected because it is just reading STDOUT and is not aware of the command that is writing to STDOUT on the server, but the tail -f command will no longer be writing any new content to STDOUT because it is still tailing the old file or it has encountered some error and stopped working (I cannot determine what would happen).

Some possible solutions (untested and assuming modifications following from the resolution of #30 to the original example used as a "template"):

1. Remove the -f flag and periodically send tail file.txt. This will avoid having to detect when a new file is created and should "just work", but you will be continuously issuing the tail command instead of just issuing the command once.
2. Use the logrotate application installed on nearly every Linux/UNIX/macOS server, including CompactRIOs and Raspberry Pis, with its copytruncate directive to "roll over" the file.txt. The copytruncate directive tells the logrotate application to keep file.txt open, copy the content of the file to a new file, and delete the content (not the file) in file.txt. This will play nice with the -f flag for the tail command because the tailed file will remain open and still exist. A little bit more information can be found in this Q&A.
3. Implement functionality similar to the copytruncate directive within the process/application that is creating the file.txt in Step 2 above and controlling the new file/rollover process (Steps 6-8).
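The copy/truncate idea can be sketched in a few lines of Python. This is an illustration under assumptions, not logrotate's implementation: `copy_truncate` and `demo_copy_truncate` are hypothetical names, and the demo runs against a scratch directory. It shows the property that matters for tail -f, namely that the file keeps its identity (inode) after the rollover:

```python
import os
import shutil
import tempfile


def copy_truncate(path: str, backup_path: str) -> None:
    """Back up `path`, then empty it in place instead of replacing it."""
    shutil.copyfile(path, backup_path)  # snapshot the current content
    os.truncate(path, 0)                # keep the same file, drop its content


def demo_copy_truncate():
    """Roll over a scratch file; return (same_inode, new_size, backup_text)."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "file.txt")
        backup = os.path.join(folder, "old_file.txt")
        with open(path, "w") as f:
            f.write("line 1\n")
        inode_before = os.stat(path).st_ino
        copy_truncate(path, backup)
        with open(backup) as f:
            backup_text = f.read()
        same_inode = os.stat(path).st_ino == inode_before
        return same_inode, os.path.getsize(path), backup_text
```

Two caveats apply to this approach in general: anything written between the copy and the truncate is lost, and the writing application should open the file in append mode so that it continues at the new end of the file after the truncate.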

temin4u commented 3 years ago

> You already found one bug with appending the EOL character to the command; however, I would recommend using only the Linefeed (\n) character instead of the EOL character. I think you are connecting to a remote SSH server on a Linux/UNIX machine, like a CompactRIO or Raspberry Pi, while the LabVIEW code is running on a Windows computer. The EOL character from the String palette in LabVIEW automatically determines the EOL (End-of-Line) sequence based on the host machine's operating system. On Windows, the EOL sequence is \r\n (Carriage Return and Linefeed/Newline), while on Linux/UNIX/macOS it is just \n (Linefeed/Newline). You may be experiencing some issues with sending commands to a Linux/UNIX/macOS machine if the Carriage Return is included.

Got it, thanks for the feedback.

> I think a problem might be with the -f flag for the tail command. This appears to keep the tail command running and continuously displaying the content of file.txt to STDOUT, i.e. the terminal. However, every second (1000 ms) you are sending the tail -f command. If you run for 10 seconds, then you will have 10 instances of the tail -f command running, all writing the content to STDOUT (Standard Output), which is sent to the Windows 10 host over SSH and read in the Read loop. You should either drop the -f flag or remove periodically sending the tail command and just send the tail -f command once at the start of the VI or when a button is pressed on the front panel.

Got it, thanks for the feedback. Sending the 'tail -f' command only once worked fine.

> I think I might know the problem, but I need to know the order of events (question marks used to indicate steps that are not clear or unknown):

1. Remote SSH server machine started up
2. "file.txt" created? --> Yes.
3. LabVIEW VI snippet started
4. tail file.txt command sent
5. Read output for tail command for some time
6. "file.txt" closed? --> I'm not sure how the Linux application handles this.
7. "file.txt" renamed to "old_file.txt"? --> Yes.
8. New file created? (file.txt, file2.txt, etc.?) --> Yes, file.txt is always the latest one. Every hour, the application backs it up by renaming and zipping it.
9. Error with LabVIEW code --> No error is generated by the read and write functions if the file is renamed or deleted.

Is the file always named "file.txt"? --> Yes.

> If you are using the tail -f command, then when file.txt is moved/renamed/rolled over, the tail -f command will result in unusual behavior because it is no longer "tailing" the original file. The LabVIEW code should be unaffected because it is just reading STDOUT and is not aware of the command that is writing to STDOUT on the server, but the tail -f command will no longer be writing any new content to STDOUT because it is still tailing the old file or it has encountered some error and stopped working (I cannot determine what would happen).

Yes, but after the original file is recreated, even if I write the same tail command to the channel, it is just echoed back; basically, all the commands written after the interruption are just echoed back. So I believe the channel has gotten into an error state and is not able to recover.

> Some possible solutions (untested and assuming modifications following from the resolution of #30 to the original example used as a "template"):
>
> 1. Remove the -f flag and periodically send tail file.txt. This will avoid having to detect when a new file is created and should "just work", but you will be continuously issuing the tail command instead of just issuing the command once.
> 2. Use the logrotate application installed on nearly every Linux/UNIX/macOS server, including CompactRIOs and Raspberry Pis, with its copytruncate directive to "roll over" the file.txt. The copytruncate directive tells the logrotate application to keep file.txt open, copy the content of the file to a new file, and delete the content (not the file) in file.txt. This will play nice with the -f flag for the tail command because the tailed file will remain open and still exist. A little bit more information can be found in this Q&A.
> 3. Implement functionality similar to the copytruncate directive within the process/application that is creating the file.txt in Step 2 above and controlling the new file/rollover process (Steps 6-8).

Thanks for your input on the possible solution paths. Since I'm mostly interested in the changed data, I went with tail -f; dropping the -f might fetch the same data again in consecutive reads, so I didn't go with that. Also, I have yet to try the latest code with the timeout feature added. If I get a timeout from the channel, I think I can safely close the current channel and create a new one. May I know your thoughts on this?
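The concern about fetching the same data twice without -f comes down to remembering how far the previous read got and noticing when the name points at a different file. A minimal local sketch of that bookkeeping in Python follows. It is an assumption-laden illustration: `NameFollower` and `demo_follow_by_name` are hypothetical names, it polls a local path rather than an SSH channel, and it relies on inode numbers, so it applies to POSIX-style file systems:

```python
import os
import tempfile


class NameFollower:
    """Poll a file by name and return only the bytes not seen before."""

    def __init__(self, path: str):
        self.path = path
        self.inode = None   # identity of the file read on the last poll
        self.offset = 0     # number of bytes already returned

    def poll(self) -> str:
        try:
            info = os.stat(self.path)
        except FileNotFoundError:
            return ""  # the file is briefly missing while it is rotated
        if info.st_ino != self.inode or info.st_size < self.offset:
            # A new file took over the name, or the file was truncated.
            self.inode = info.st_ino
            self.offset = 0
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read()
        self.offset += len(data)
        return data.decode("utf-8", errors="replace")


def demo_follow_by_name():
    """Rotate a scratch file between polls and collect what each poll saw."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "file.txt")
        follower = NameFollower(path)
        seen = []
        with open(path, "w") as f:
            f.write("a\n")
        seen.append(follower.poll())   # first content
        seen.append(follower.poll())   # nothing new, nothing repeated
        os.rename(path, os.path.join(folder, "old_file.txt"))
        seen.append(follower.poll())   # name is missing during rotation
        with open(path, "w") as f:
            f.write("b\n")
        seen.append(follower.poll())   # new file, read from its start
        return seen
```

The same two checks (file identity changed, or size smaller than the remembered offset) are what any rotation-detection logic on the LabVIEW side would need in some form.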

I'm planning to integrate the module with our project. Can I use the source as it is, or should I wait for the VIPM release? Any additional information on this would be great.

volks73 commented 3 years ago

> Yes, but after the original file is recreated, even if I write the same tail command to the channel, it is just echoed back; basically, all the commands written after the interruption are just echoed back. So I believe the channel has gotten into an error state and is not able to recover.

I agree that something has gone into an error state. However, I still think it might be the tail command and not the channel. I created the following steps/sequences to help me visualize the order of events. It might be helpful for you, too.

1. server  --> create  --> file.txt
2. channel --> tail -f --> file.txt
   server  --> writing --> file.txt
3. server  --> copy    --> file.txt --> old_file.txt
   server  --> delete  --> file.txt 
   server  --> create  --> file.txt <-- Not the same file as the original, just the same name
4. channel --> tail -f --> ???????? 
   server  --> writing --> file.txt <-- New file, not the original

Note, the Step 3 copy, delete is a breakdown of a "move"/mv command. At Step 4, the original tail -f is tailing/pointing to a non-existent file. This probably causes the tail command to enter an error state, which in turn causes the channel to echo commands.
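On POSIX systems, a file that was opened before a rename normally stays attached to the renamed file rather than to the name. That suggests one plausible reading of Step 4: the original tail -f keeps following old_file.txt for as long as it exists and never sees the new file.txt. The Python sketch below illustrates only that file-system behavior on a local scratch directory; it is not a test of tail itself, and `read_after_rotation` is a hypothetical name:

```python
import os
import tempfile


def read_after_rotation() -> str:
    """Return what an already-open reader sees after the file is rotated."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "file.txt")
        old_path = os.path.join(folder, "old_file.txt")
        with open(path, "w") as f:
            f.write("original\n")

        # Stands in for the running "tail -f": opened once, kept open.
        reader = open(path, "rb", buffering=0)
        reader.read()                      # consume the existing content

        os.rename(path, old_path)          # Step 3: move to old_file.txt
        with open(path, "w") as f:         # Step 3: new file, same name
            f.write("written to new file\n")
        with open(old_path, "a") as f:
            f.write("written to old file\n")

        seen = reader.read().decode()      # reader is still on the old file
        reader.close()
        return seen
```

On a POSIX system the reader should report only the line appended to old_file.txt, which is consistent with tail -f going quiet once the server starts writing to the new file.txt.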

Would it be possible instead of doing a "move" (copy/delete) to do a copy/truncate on the original file.txt? Once an hour, the file.txt is copied to old_file.txt, but file.txt is not deleted on disk, only the content of the file is deleted. This would be similar to setting the file position in LabVIEW to "start" or 0 with the seek function. This would keep the tail -f command pointing to an existing file on the server and avoid entering an error state. This would also avoid having to detect when this happened to create a new channel.

Can you elaborate a little more on the process/application that is creating the file.txt file on the server, writing to it, and backing up once an hour? Is this a LabVIEW application as well? Can you have it send a message to your LabVIEW application that the backup is occurring?

> Also, I have yet to try the latest code with the timeout feature added. If I get a timeout from the channel, I think I can safely close the current channel and create a new one. May I know your thoughts on this?

I think you may have an even easier mechanism for this. Regardless of whether truncating as mentioned above works, you appear to know that file.txt is backed up every hour. Can you add a timer to your LabVIEW code to stop and close the channel every hour?

I am not sure the timeout feature will help with this. There are a couple of VIs in the toolkit's Channel palette/class that could be of use: Read End-of-File.vi and Read All Stderr.vi. It is possible that when the original file.txt is backed up, the tail command may signal EOF (End-of-File). You can use the Read End-of-File.vi to see if the EOF signal has been set. This would indicate the channel should be closed and re-opened. Similarly, if the tail command does in fact generate an error, it may write the error message or some value to the STDERR stream. You can continuously read the STDERR stream, similar to the STDOUT stream, in a separate loop. If it is not empty, then restart/recreate the channel. STDERR is generally supposed to be empty unless there is an error, but some applications use it for logging and information messages. This might be good to do anyway because, if any error occurs, you will have more information to help with debugging. I cannot commit to any availability or capacity, but I may be able to play around with this later this week.

> I'm planning to integrate the module with our project. Can I use the source as it is, or should I wait for the VIPM release? Any additional information on this would be great.

You are free to integrate the toolkit with your project without having to wait for a new release. The changes for the next release are relatively minimal, and you could re-implement them in the meantime if needed. The changes are mostly focused on the examples, too. Only a small number of very specific VIs, which I don't think you are using, have been modified. However, I might be able to push a new release today (2021-01-05). It will take a while to propagate to VIPM, so even if I am able to make a release today, it may not be soon enough for your project.

volks73 commented 3 years ago

@temin4u I published a new release (v1.1.0). I submitted the update to VIPM.io, but it will take some time for it to appear. In the meantime, you can download the VIP from the Releases page.

temin4u commented 3 years ago

Awesome, thanks for all your insights @volks73 , I will keep posting if I face any challenges during the integration.

temin4u commented 3 years ago

Quick updates:

1. In my sample code, when I use Read Stderr.vi in a parallel loop with the same channel reference, LabVIEW crashes after some time.
2. Whenever a new file is created on the server, the read channel timeout returns TRUE. This should be helpful for me.
3. Read End-of-File.vi did not work either.

volks73 commented 3 years ago

Thanks for the screenshot.

> In my sample code, when I use Read Stderr.vi in a parallel loop with the same channel reference, LabVIEW crashes after some time.

I was mistaken. You cannot read STDERR and STDOUT in parallel. While you can read and write in separate loops, you must do all of the reading for the channel in a single loop. You should be able to simply move the Read All Stderr VI to the Read Loop in your screenshot, after the Read All VI. In other words, read STDOUT and STDERR serially as opposed to in parallel. If LabVIEW continues to crash when reading STDOUT then STDERR, then it is most likely a bug in the toolkit or C library and I will have to investigate further, but I suspect the crash is because of simultaneous access to the read "stream" of the channel.

> Whenever a new file is created on the server, the read channel timeout returns TRUE. This should be helpful for me.

Great to hear! Sounds like you found a solution that will work for your use case. The Timed Out? indicator is new in v1.1.0, which is up on VIPM.io now and you should be able to easily integrate with your project.
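The timeout-based workaround amounts to a small restart loop: read with a timeout, and when a read times out, close the channel and open a fresh one that re-issues the tail command. The Python sketch below shows only that control flow. Everything in it is hypothetical: `FakeChannel` is a scripted stand-in rather than the toolkit's channel, and `read_with_restart` and `open_fake_channel` are illustrative names:

```python
class FakeChannel:
    """Hypothetical stand-in for an SSH channel running "tail -f file.txt"."""

    def __init__(self, chunks):
        self._chunks = list(chunks)
        self.closed = False

    def read(self, timeout_s):
        """Return (data, timed_out); time out once the scripted data runs dry."""
        if self._chunks:
            return self._chunks.pop(0), False
        return "", True

    def close(self):
        self.closed = True


def read_with_restart(open_channel, attempts, timeout_s=5.0):
    """Read `attempts` times, replacing the channel after every timeout."""
    collected = []
    channel = open_channel()
    try:
        for _ in range(attempts):
            data, timed_out = channel.read(timeout_s)
            if timed_out:
                # No output within the timeout: assume the tailed file was
                # replaced, so drop this channel and start a fresh one.
                channel.close()
                channel = open_channel()
                continue
            collected.append(data)
    finally:
        channel.close()
    return collected


# Usage with scripted channels: the first goes quiet after two chunks.
scripts = [["line 1\n", "line 2\n"], ["line 3\n"]]
opened = []


def open_fake_channel():
    channel = FakeChannel(scripts[len(opened)])
    opened.append(channel)
    return channel


output = read_with_restart(open_fake_channel, attempts=4)
```

One design caveat: a timeout can also mean the file was simply quiet, so the timeout needs to be longer than the normal gap between writes, and a restarted plain tail -f will print the last lines of the file again.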

> Read End-of-File.vi did not work either.

Thanks for the information and trying. I wasn't confident this would work, but I thought it was at least worth trying. The timeout returning TRUE looks to be the better solution for you.

I am going to leave this issue open a little longer so you can continue to post updates during integration, but this issue feels very close to resolved/complete.

temin4u commented 3 years ago

@volks73 Good day! I have a few observations and questions on which I would appreciate your feedback.

> In other words, read STDOUT and STDERR serially as opposed to in parallel.

I tried this, but if I use Read All Stderr.vi after STDOUT (Read All.vi), then I'm now getting stuck on the Read All Stderr.vi function; basically, it is not timing out.

In contrast, Read Stderr.vi is timing out as expected.

1. Is libssh2lv.dll not compatible with Pharlap OS? I tried copying libssh2lv.dll to the ni-rt folder and provided the path explicitly to Initialize.vi, but it returns an error. I did go through https://github.com/fieldrndservices/libssh2-labview#dependencies-installation where support for Windows and Linux is covered. I just want to know if there is a remote chance it will work with Pharlap.
2. Once we do tail -f, is there a way to stop the tail process and break out of it, similar to CTRL+C in PuTTY? At present, once we enter a tail -f command, the channel can't be used for any subsequent commands. I tried writing ETX, but it didn't work.

Thanks in advance

volks73 commented 3 years ago

I will have to look into the STDERR not working. I should probably put together an example for reading the contents of STDERR. Thanks for the information.

> Is libssh2lv.dll not compatible with Pharlap OS? I tried copying libssh2lv.dll to the ni-rt folder and provided the path explicitly to Initialize.vi, but it returns an error. I did go through https://github.com/fieldrndservices/libssh2-labview#dependencies-installation where support for Windows and Linux is covered. I just want to know if there is a remote chance it will work with Pharlap.

At the moment, I can only say macOS, Windows, and Linux (including NI Linux RT) are supported. I do not have much experience with Pharlap. Are you able to compile C code on your Pharlap device? I do not have a Pharlap device available to test. I believe Pharlap is being replaced with NI Linux RT. In other words, you will need to build the libssh2 and libssh2lv libraries from source on a Pharlap device. Although specific to NI Linux RT, the following resources may be adaptable to compiling from source on Pharlap:

Setup a Build Environment on Embedded Hardware Running NI Linux RT
Build instructions for libssh2-nilrt-ipk: Instead of cmake --build . --target ipk, you would just use cmake --build . at the end since I don't think Pharlap uses IPK files.
Build instructions for libssh2lv-nilrt-ipk
C/C++ on Pharlap: It looks like it might be possible to compile C code on Pharlap, but it is not as straight-forward as NI Linux RT, and I have no personal experience with it.

> Once we do tail -f, is there a way to stop the tail process and break out of it, similar to CTRL+C in PuTTY? At present, once we enter a tail -f command, the channel can't be used for any subsequent commands. I tried writing ETX, but it didn't work.

CTRL+C is actually sending a "signal" to tail -f. I have to look into how to send the SIGINT through libssh2. I am not even sure it is possible if not running interactively with a shell/terminal like PuTTY. However, does tail -f stop if you send CTRL+Z (UNIX EOF)/CTRL+D (Windows EOF) using PuTTY? If yes, then we might be able to use the Send End-of-File VI to the tail -f command to terminate the tail application.

temin4u commented 3 years ago

> I will have to look into the STDERR not working. I should probably put together an example for reading the contents of STDERR. Thanks for the information.

Sure thanks. Would be glad to help in testing if required.

> In other words, you will need to build the libssh2 and libssh2lv libraries from source on a Pharlap device. Although specific to NI Linux RT, the following resources may be adaptable to compiling from source on Pharlap:

Thanks, I understand the process. Initially I was wondering if libssh2lv.dll would be directly compatible with Pharlap, but based on the compatibility checker tool it looks like it has a lot of Windows DLL dependencies and needs to be recompiled: https://knowledge.ni.com/KnowledgeArticleDetails?id=kA00Z0000019M0tSAE&l=en-IN

> CTRL+C is actually sending a "signal" to tail -f. I have to look into how to send the SIGINT through libssh2. I am not even sure it is possible if not running interactively with a shell/terminal like PuTTY.

Yes, it seems this is not possible in a non-interactive session. Based on some online research, closing the current channel and creating a new one seems to be the solution.

"How to send SIGINT (Ctrl-C) to current remote process over SSH , If you use ssh to start a process without a PTY on a remote system, then as far as I can tell, there's no way to signal the remote process through There's no Ctrl+C in non-interactive session (get_pty=False). And you do not need it, the command is closed with the session: ssh.close() If you want to keep the session (e.g. for another command), close only the command channel: stdin, stdout, stderr = ssh.exec_command('python movemotor.py', get_pty=False) stdout.channel.close()" https://www.xspdf.com/resolution/53636262.html https://stackoverflow.com/questions/44348083/how-to-send-sigint-ctrl-c-to-current-remote-process-over-ssh-without-t-optio

> However, does tail -f stop if you send CTRL+Z (UNIX EOF)/CTRL+D (Windows EOF) using PuTTY? If yes, then we might be able to use the Send End-of-File VI to the tail -f command to terminate the tail application.

CTRL+Z did stop the 'tail -f' in PuTTY, but it is not the EOF signal as we assumed. It suspends the process by sending it the SIGTSTP signal. The article below covers this, and I verified the same.

https://superuser.com/questions/262942/whats-different-between-ctrlz-and-ctrlc-in-unix-command-line
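For reference, the usual defaults on a POSIX terminal can be summarized as a small lookup table. This Python sketch is only a summary under that assumption (`CONTROL_KEYS` and `describe` are hypothetical names, and `stty -a` on the target shows the actual settings). The point it illustrates is that the key itself only sends a byte; it is the terminal (PTY) on the remote side that turns the byte into a signal, which would be consistent with a raw ETX byte having no effect on a channel without a PTY:

```python
# Typical defaults on a POSIX terminal; `stty -a` shows the actual settings.
CONTROL_KEYS = {
    "CTRL+C": (0x03, "ETX", "SIGINT, interrupts the foreground process"),
    "CTRL+D": (0x04, "EOT", "end-of-file on input, no signal"),
    "CTRL+Z": (0x1A, "SUB", "SIGTSTP, suspends the foreground process"),
}


def describe(key: str) -> str:
    """Describe the byte a key sends and what the terminal does with it."""
    code, name, effect = CONTROL_KEYS[key]
    return f"{key} sends 0x{code:02X} ({name}): {effect}"
```

For example, `describe("CTRL+Z")` reports the SUB byte and the suspend behavior observed in PuTTY.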

volks73 commented 3 years ago

@temin4u Wow, some great information, thank you!

I don't think the libssh2lv DLL uses a lot of Windows-specific calls. It is just a wrapper around the LIBSSH2 library with an API more suitable for integration with LabVIEW, but it does statically link the LIBSSH2 and OpenSSL libraries, which may have a bunch of Windows-specific calls that prevent just copying the Windows DLL over to the Pharlap system.

On NI Linux RT, the libssh2 and OpenSSL libraries are dynamically linked and the OpenSSL library is part of the "distribution" installed on cRIOs, PXIs, etc. NI Linux RT does not have LIBSSH2 installed as part of the distribution and it is not available from NI's opkg repository. I would imagine Pharlap is similar. It may have OpenSSL, but the LIBSSH2 library would still need to be compiled and available as a DLL or statically linked into the libssh2lv DLL. Compiling OpenSSL can be very error prone and troublesome, to the point that if it is not already available on the system, it may not be worth it. Side note: OpenSSL is not a "hard" dependency. There are other crypto libraries that can be used with libssh2. On Windows, and possibly Pharlap, there is the WinCNG crypto library that can be used to build LIBSSH2 instead of OpenSSL.

It looks like providing Pharlap support might still be doable, but it is not straightforward and would be a time sink, sorry.

> CTRL+Z did stop the 'tail -f' in PuTTY, but it is not the EOF signal as we assumed. It suspends the process by sending it the SIGTSTP signal. The article below covers this, and I verified the same.

Thanks for verifying CTRL+Z usage. Did you try CTRL+D from within PuTTY? I may have misquoted or mistyped. I get the various CTRL key combinations in terminals mixed up all the time. Even if CTRL+D works, based on the information you provided, I agree that the only consistent and "sure fire" way to terminate the tail -f command through SSH is to close the channel. You probably do not need to disconnect and destroy the session, just the channel.

volks73 commented 3 years ago

I am going to close this issue because it looks like the original question has been addressed. If you would like to continue discussing Pharlap support, then can a new issue/question be started?

temin4u commented 3 years ago

Hi @volks73 ,

This ticket may be closed. I have another query, will open a ticket for the same.

Thanks