gilestrolab / ethoscope

a platform for monitoring animal behaviour in real time from a Raspberry Pi
http://lab.gilest.ro/ethoscope/
GNU General Public License v3.0

Accessing Video Data #146

Closed zzaidi148 closed 3 years ago

zzaidi148 commented 3 years ago

Hello, please excuse the naive nature of my question.

I was just wondering how I could access the video results data from the node or the ethoscope. I am not entirely sure my video backup service is working. I am using an RPi3 as the ethoscope and an RPi4 as the node. I can access the .db file for my experiment but do not know where to find the .h264 or .mp4 results. I have 128 GB SD cards attached to everything, so I don't think storage would be a problem for the backup service or retrieval in general. I am just unaware of how to access the video directory.

antortjim commented 3 years ago

Dear @zzaidi148

The videos are recorded to a folder on each ethoscope following this scheme:

/ethoscope_data/results/<machine_id>/<ethoscope_name>/<date_time>/

for instance

/ethoscope_data/results/014aad42625f433eb4bd2b44f811738e/ETHOSCOPE_014/2020-05-09_12-00-07

In those folders you will find many .h264 videos, each lasting 5 minutes and named in a machine-friendly (and also human-friendly) way, so you know their chronological order as well as the particular ethoscope and date they were produced on.
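If you want to script this, here is a minimal Python sketch that lists every chunk in chronological order (assuming the default /ethoscope_data/results location):

from pathlib import Path

results = Path("/ethoscope_data/results")
# chunk names embed the experiment start datetime and a zero-padded
# chunk index, so sorting the paths alphabetically also sorts them
# chronologically within each experiment folder
for chunk in sorted(results.rglob("*.h264")):
    print(chunk)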

If your video backup service is running, you should get some output when you run this on the CLI

 systemctl status ethoscope_video_backup

The folders mentioned above should be "mirrored" in the node with the following structure

./<machine_id>/<ethoscope_name>/<date_time>/

under /ethoscope_data/videos/ (or whatever folder you specified in /etc/ethoscope.conf).

Otherwise, you need to activate the service. Giorgio's lab already wrote the systemd service unit file for us; it is located under /opt/ethoscope-node/scripts/ethoscope_video_backup.service. One way to get your system to run this service in the background is to copy this file to /etc/systemd/system and enable it:

 cp /opt/ethoscope-node/scripts/ethoscope_video_backup.service /etc/systemd/system
 systemctl daemon-reload # refresh database of services
 systemctl start  ethoscope_video_backup # start the service
 systemctl status ethoscope_video_backup # see if the service works
 journalctl -ru ethoscope_video_backup # more details on whether it is working or not
 systemctl enable ethoscope_video_backup # tells systemd to run the start command for you at every boot, effectively keeping the service always running in the background without the user starting it, like the ethoscope db backup

Let me know if something is not clear!

zzaidi148 commented 3 years ago

Thank you @antortjim for your comprehensive response. I really appreciate it.

So following the steps you detailed, I can report that my video backup service is active and running on my RPI4 node:

[screenshot: systemctl status output showing the service active and running]

However, I am just not well-versed enough in the command line to extract the video files I need from the node or the ethoscopes. Since there is no user interface, I cannot just open the filesystem and view the videos, so I am wondering what method I could use to get to the videos and be able to view and store them for separate analysis. Should I use nano /ethoscope_data/results/014aad42625f433eb4bd2b44f811738e/ETHOSCOPE_001/2021-02-04_12-00-00? I tried that and it opened a directory that had nothing in it. I'm sorry if my nascent CLI knowledge is embarrassingly simple. Or am I supposed to see the video files under Management in the node's interface (where the .db and .txt files can be accessed)?

I sincerely appreciate all your help!

ghost commented 3 years ago

Hello,

We are having the same issue with downloading videos. We can see them saved on the node, but are struggling to find them in the web interface. Is that a thing, or is it better to transfer them with an external SSD?

Thanks, -Daniel Harvey

zzaidi148 commented 3 years ago

Hey @Nocturtle ,

If you can see your video chunks saved on the node, then all you need to do from there is mount your external hard drive to your node and let the video backup service automatically save your files there (follow the instructions on the Notion page). I do not think video recordings end up on the node interface. In fact, I am confused as well! I was assuming that the command "Start Tracking" would produce a .txt, a .db, and .h264 video files. However, all the video chunks recorded on the node came from the command "Record Video". Therefore, I am wondering whether the two commands can be issued simultaneously to acquire tracking information as well as the video of that tracking experiment. Perhaps this is what the API is for? @antortjim

antortjim commented 3 years ago

Watching videos backed up to the node

As far as I know, the videos cannot be viewed from the GUI, even if they're saved to the node. That sounds like a problem if your node does not have any GUI at all, i.e. if it's just a CLI. But there are alternatives! You have three options to move forward, from hardest to easiest:

  1. Open the video from the node CLI directly. Some programs can render graphic content within a CLI. I think mpv is one of those. You can give it a try and install it in Arch. Probably just a pacman -Sy mpv should suffice. Here are the mpv keybindings https://wiki.archlinux.org/index.php/Mpv#Key_bindings

  2. Stream the video from the node to your destination computer. I know mpv can do this:

mpv sftp://NODE_ADDRESS:/ethoscope_data/videos/XXXX/XXXX/XX/XX.h264

Replace NODE_ADDRESS with node or 192.168.123.1 or whatever IP address the node has.

  3. Easiest of all, I think, is downloading the video to your computer. This can be done with CLI programs like scp or rsync. I suggest starting with FileZilla, a GUI that does the same thing. You need to:

    1. Install FileZilla for your platform (https://filezilla-project.org/) on your PC (not the node).
    2. Open it and provide Host (again NODE_ADDRESS), username (node), password (node) and port (leave it empty or maybe try 22). You may have to prepend the NODE_ADDRESS with sftp://.
    3. Quick connect and save this configuration, so next time you don't need to enter all these fields again.

This program then works like a file explorer on two computers simultaneously and allows you to swap files between them, effectively up/downloading files between your computer and the node.

You can of course just get a USB stick and copy all the videos there. In order to copy a folder with everything in it in a Linux CLI, you would issue

cp -r SOURCE DEST

where SOURCE is /ethoscope_data/videos/XXX/XXX/XXX and DEST is whichever path the USB is mounted to. You should be able to figure out what DEST is by running df -h, looking for a partition with a size equal to your USB stick, and checking the "Mounted on" column.
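If you prefer scripting the copy, here is a minimal Python equivalent of cp -r (the SOURCE placeholders are as above and the mount point is hypothetical; substitute your own):

import shutil

src = "/ethoscope_data/videos/XXX/XXX/XXX"   # placeholder, as above
dst = "/run/media/usbstick/videos"           # hypothetical USB mount point
# dirs_exist_ok needs Python >= 3.8
shutil.copytree(src, dst, dirs_exist_ok=True)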

Tracking and recording at the same time.

This is the exact same question I asked myself when I started with the ethoscopes. Unfortunately, this is not possible with the RPis because they don't have the power to record (save all the frames to the SD card as a video) and track (run the computer vision algorithm and save the results as an SQLite database to the SDCard). You can do only one of the two:

  1. Recording a video allows you to analyze the results offline and with different programs. But you need to analyze the videos to get anything meaningful!
  2. Tracking gives you a result live. But if the experiment fails, or if you want to run a different Computer Vision algorithm, you need to repeat the experiment...

Fortunately, there is a hack around this. The SQLite .db file contains a table called IMG_SNAPSHOTS, which saves a frame every 300 seconds by default, for debugging and documentation purposes I presume. You can change the frequency of this frame saving, theoretically to even less than a second, so you would get all the frames saved there (our ethoscopes run at ~2 FPS).

But as I said, it's just not possible to save everything, because saving one frame takes a significant amount of time. If you save every frame, your FPS is reduced, in my experience, to < 1 FPS (the RPi would spend more time saving the frame than running the computer vision on it). Needless to say, saving every frame would also cause your .db file to explode in size (MBs to GBs) and could fill up your SD card in a few days. In practice, you can set this saving frequency to a minimum of 10 seconds. Your mileage may vary.

This hack is not useful for offline analysis; I never tried it, but I think 1/10 FPS is too little. It is very useful, though, if you want to get a better picture of how the ethoscope works, i.e. the input -> output relationship. The 300 seconds number can be changed here: https://github.com/gilestrolab/ethoscope/blob/0521648f33103b8055342aea3ef37c06fde150a1/src/ethoscope/utils/io.py#L253

On an ethoscope, you can run this command to change 300 to 10:

sed -i 's/period=300.0/period=10.0/g' /opt/ethoscope-device/src/ethoscope/utils/io.py

Confirm by grepping the Python file:

grep "period=10.0" /opt/ethoscope-device/src/ethoscope/utils/io.py

Important: For this change to apply, you need to either reboot the ethoscope or restart the ethoscope service (sudo reboot or sudo systemctl restart ethoscope_device).

If you want to be able to do offline analysis, you are better off recording a video and analyzing it on the node. There is no documentation for this that I am aware of, but it is definitely doable using the device_server.py script here: https://github.com/gilestrolab/ethoscope/blob/master/src/scripts/device_server.py. I recommend you start the video during the L phase, because otherwise the targets might not be visible to the computer vision; in that case, you may have to provide the coordinates manually.

antortjim commented 3 years ago

You can view the frames in the IMG_SNAPSHOTS table with a GUI. The best one by far is https://sqlitebrowser.org/

  1. Open the .db file with sqlitebrowser
  2. Go to Browse Data > select IMG_SNAPSHOTS in the Table: dropdown menu
  3. The time when the frame was taken is saved in the t field in ms since the experiment start, and the frame is saved under img.

[screenshot: the IMG_SNAPSHOTS table opened in sqlitebrowser]
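If you would rather extract the snapshots programmatically than browse them, here is a minimal Python sketch (it assumes the img blobs are JPEG-encoded, which you can verify by exporting one from sqlitebrowser; the database path is a placeholder):

import sqlite3

db = sqlite3.connect("/path/to/experiment.db")  # placeholder path
# t is in ms since the experiment start; img holds the image bytes
for t, img in db.execute("SELECT t, img FROM IMG_SNAPSHOTS"):
    with open(f"snapshot_{t}.jpg", "wb") as fh:
        fh.write(img)
db.close()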

ggilestro commented 3 years ago

Thanks @antortjim for the nice explanation!

Regarding the video chunks, this is our procedure:

  1. we use the ethoscope_video_backup service on the node to collect videos from the ethoscopes. It is recommended to use a wired connection for this purpose, particularly if you have many videos.
  2. once the files are on the node, we use the python script pasted below to convert the chunks to mp4.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
#  h264TOmp4.py
#  
#  Copyright 2020 Giorgio Gilestro <giorgio@gilest.ro>
#  
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#  
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#  
#  You should have received a copy of the GNU General Public License
#  along with this program; if not, write to the Free Software
#  Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
#  MA 02110-1301, USA.
#  
#  

from glob import glob
import os
from optparse import OptionParser

def process_video (folder, verbose=True):
    '''
    process video in folder
    '''

    #move to folder
    os.chdir(folder)

    #prepare filenames
    with os.popen("ls *.h264 | head -n 1 | cut -d . -f 1") as cmd:
        prefix = "whole_%s" % cmd.read().rstrip()

    #calculate fps
    with os.popen("ls *.h264 | head -n 1 | cut -d _ -f 5 | cut -d @ -f 2") as cmd:
        fps = cmd.read().rstrip()

    tmp_file = "%s.tmp" % prefix
    filename = "%s.mp4" % prefix

    #merge files in one big chunk
    os.system( "cat *.h264  > %s" % tmp_file)

    os.system("ffmpeg -r %s -i %s -vcodec copy -y %s -loglevel panic" % ( fps, tmp_file, filename ) )

    os.system ("rm %s" % tmp_file)

    if verbose: print ("successfully processed files in folder %s" % folder)


def list_mp4s (root_path):
    '''
    returns a list of folders that contains a mp4 file
    '''

    all_folders = [ x[0] for x in os.walk(root_path) ]
    have_mp4s = [p for p in all_folders if glob(os.path.join(p, "*.mp4"))]

    return have_mp4s

def crawl (root_path):
    '''
    crawl all terminal folders in root_path
    '''

    all_folders = [ x[0] for x in os.walk(root_path) ]

    have_mp4s = [p for p in all_folders if glob(os.path.join(p, "*.mp4"))]
    terminal_folders = [p for p in all_folders if glob(os.path.join(p, "*.h264"))]

    for folder in terminal_folders:
        if folder not in have_mp4s:
            process_video (folder)

if __name__ == '__main__':

    parser = OptionParser()
    parser.add_option("-p", "--path", dest="path", default="/ethoscope_data/videos", help="The root path containing the videos to process")
    parser.add_option("-l", "--list", dest="list", default=False, help="Returns a list of folders containing mp4 files", action="store_true")

    (options, args) = parser.parse_args()
    option_dict = vars(options)

    if option_dict['list']:
        l = list_mp4s (option_dict['path'])

        print ("\n".join(l))
        print ("Found %s folders with mp4 files" % len(l))
        os.sys.exit()

    crawl( option_dict['path'] )

The script is used with the following syntax:

./process_videos.py -p /ethoscope_data/videos

Hope this helps!

zzaidi148 commented 3 years ago

Thank you @ggilestro @antortjim for your amazing support.

I do have a few questions though, given my inexperience with linux and coding in general.

  1. @ggilestro to run the h264TOmp4.py python script, is there a specific directory I need to cd into before running it? If so, I have not found any file names matching h264TOmp4.py in node_src/scripts. Or am I supposed to create a new python script named h264TOmp4.py on my node, copy into it the script you pasted in your earlier response, and run that?
  2. @antortjim @ggilestro I was wondering what other tracking software you guys use to quantify your movement experiments. I have been trying to use the rethomics R package to analyze my results, however, I have been unable to continue since I am having trouble linking my metadata and data despite meticulously following the tutorial posted by @qgeissmann with his toy data.
  3. Lastly, @antortjim I was wondering how you generated the gif and the png (movement frequency) file you used to show your results. [two attached images]

I am incredibly appreciative of your unwavering support and cannot thank you enough for your help.

ggilestro commented 3 years ago

The script is in the folder /opt/ethoscope-node/scripts/tools/. The default name is actually process_all_h264.py.

zzaidi148 commented 3 years ago

Seems that ffmpeg is not working or does not have permission:

sh: whole_2021-02-09_19-55-42_001e39a6e1a1497489032a8e83159688__1280x960@25_00001.tmp: Permission denied
sh: ffmpeg: command not found
rm: cannot remove 'whole_2021-02-09_19-55-42_001e39a6e1a1497489032a8e83159688__1280x960@25_00001.tmp': No such file or directory
succesfully processed files in folder /ethoscope_data/videos/001e39a6e1a1497489032a8e83159688/ETHOSCOPE_001/2021-02-09_19-55-42
sh: whole_2021-02-09_21-44-20_001e39a6e1a1497489032a8e83159688__1280x960@25_00001.tmp: Permission denied
sh: ffmpeg: command not found

Also @ggilestro, this is probably a stupid question, but is the .txt file that accompanies the .db files in the results the index file itself?

Thank you!

pepelisu commented 3 years ago

It looks like you did not run it as sudo, so you do not have access to the files. Did you try sudo python /opt/ethoscope-node/scripts/tools/process_all_h264.py? The ffmpeg: command not found error also means ffmpeg is not installed on the node; on Arch you can install it with pacman -S ffmpeg. The .txt file that sits next to the .db file is the equivalent of a DAM system file: there are no positions in there, just counts of virtual middle-tube crossings.

zzaidi148 commented 3 years ago

Thank you @pepelisu! So I am currently having trouble linking my data and metadata (mostly because I am unaware of where I can find each). Based on the documentation, I am assuming that all the files needed to link the data and metadata are in the .db file, since there seems to be a metadata section as well as a dam_activity section in it. If this assumption is correct, I suppose my next step would be to turn that embedded metadata into a .csv file and leave the dam_activity as is, to function as the "data". I am still, however, unaware of what the index file is and how I can obtain it. I am incredibly apologetic if my nascent understanding of all of this is blatantly apparent and embarrassing. I just feel so close yet so far from analyzing and achieving the visuals I need. Thank you!

antortjim commented 3 years ago

Tracking software

I think one of the best Python+OpenCV trackers out there is the ethoscope's. It's very nice because it implements a simple yet powerful model where the tracker "learns" what the background looks like. And being just Python+OpenCV, it is very lightweight, which makes it easy to run on an RPi. There are better trackers that make use of deep learning approaches, but these are way more computationally intensive and require running on a GPU. You could try SLEAP or DeepLabCut or other trackers in the literature. I haven't analyzed ethoscope videos with a tracker other than the ethoscope tracker.

Metadata confusion

There's two things called metadata in the ethoscope environment

1) Every time you run an experiment, you should create a metadata table where every row represents a fly loaded in an ethoscope. Every row (fly) thus needs information about WHEN and WHERE it was loaded. This is achieved with the fields date (when) and machine_name and region_id (which machine, and where in the machine). These three fields allow you to identify an exact animal in time and space. But you will hardly ever create metadata with only these 3 columns: you will want more columns that keep track of the genotype, gender, age, treatment or whatever may be relevant for the result of the experiment. You would then add the corresponding columns to your table (a minimal example follows after point 2 below).

https://rethomics.github.io/metadata.html

This file is what R needs to "link" your metadata: given this file and the path to your "ethoscope database" on the node, R produces the linked metadata, which is the same table plus a field containing the path of the corresponding .db file.

2) The .db file itself contains a METADATA table. It is also experiment metadata, but not about the flies. Instead, it gives you information about the machine where the data was produced and the settings you passed when you started your experiment via the GUI. You can see it contains the frame size used for tracking, the version of the git repository (for reproducibility), etc. The last row contains a JSON with the settings you passed in the GUI; there you can find, for instance, the SD schedule you provided (if any).
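For illustration, a minimal metadata file of the first kind could look like this (the genotype column is just an example of an extra, experiment-specific field):

machine_name,date,region_id,genotype
ETHOSCOPE_001,2021-02-08,1,CS
ETHOSCOPE_001,2021-02-08,2,CS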

Plot

I generated the plot over there to look at the raw data, i.e. the actual fly positions along the tube (y axis) over time (x axis), and not the result of the behavioral scoring (sleep annotation). I did it because I needed to debug what the h*** was going wrong in my ethoscopes that was causing the data to be garbage. The plot shows some spikes where the fly seems to jump to one end of the tube and then back to exactly where it was a moment ago, which is obviously an impossible behavior. That indicated issues with the tracking, and led me to discover that the background light was flickering and causing spurious changes all over the frames, especially at the edges, because those are the areas most sensitive to differences in lighting.

If you want to create such a plot, this should be the code (there might be a typo)

library(behavr)
library(ggetho)
library(scopr)

# assuming metadata is a linked data.table
# load the data linked to the metadata
dt_raw <- scopr::load_ethoscope(metadata, verbose=TRUE)

# do this if you have data for more than a few hours, otherwise skip it
dt_raw <- dt_raw[seq(1, nrow(dt_raw), 100),]

# this might take a while to render if you have a lot of data (if nrow(dt_raw) gives you more than a million)

# This code renders the normal sleep trace plot
ggetho(dt_raw, aes(y=asleep)) + stat_pop_etho() + scale_x_hours() + stat_ld_annotations()
# This one renders the plot from above
ggplot(dt_raw, aes(x=t, y=x)) + geom_line() + scale_x_hours() + stat_ld_annotations()

Gif

This gif and the other one I posted on the issue are also the result of me debugging the ethoscopes. I think I made this one simply by having my browser refresh and save the feed from the ethoscope as it came in (every 5 seconds, I think).

A much better way is this minimal change:

Add below this line on an ethoscope https://github.com/gilestrolab/ethoscope/blob/0521648f33103b8055342aea3ef37c06fde150a1/src/ethoscope/core/monitor.py#L129

cv2.imwrite(f"/root/drawn_frame_{self._last_frame_idx}.png", drawer.last_drawn_frame)

Reboot the ethoscope and start a TRACKING experiment.

Then open a terminal in the node and run

scp root@ETHOSCOPE_IP:/root/drawn* .

this will give you a lot of drawn frames that you can look at.

If root is not allowed to ssh/scp, you get this message:

scp: /root/drawn*: Permission denied

just stop the ethoscope and get the sdcard.

Don't run this experiment for more than a few minutes, otherwise you will end up with lots of frames. And don't do this in a real experiment: your FPS will be tiny. Do it only for documentation or debugging.

The frames won't contain the time annotation; let me know if you also need that (we would need one or two more lines then).
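If you then want to assemble the saved drawn_frame_*.png files into an animated gif, here is a minimal Python sketch (just one way to do it, assuming the Pillow library is installed; the output name is arbitrary):

from pathlib import Path
from PIL import Image  # pip install Pillow

# sort the frames numerically: a plain alphabetical sort would put
# drawn_frame_10.png before drawn_frame_2.png
paths = sorted(Path(".").glob("drawn_frame_*.png"),
               key=lambda p: int(p.stem.split("_")[-1]))
frames = [Image.open(p) for p in paths]
# 200 ms per frame, loop forever
frames[0].save("tracking.gif", save_all=True,
               append_images=frames[1:], duration=200, loop=0)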

Data

The tracking data is contained in the SQLite .db file. The .txt is just an emulation of what a DAM would give you in the same experiment. But we never use it; in fact, you could remove it and you would still be able to analyze your experiment with the same code. In order to analyze the experiment you need 3 files: 1) the .db file with the result of the experiment, 2) a .csv (or another table format) with the metadata (fly info) of the experiment, and 3) an R script to put 1 and 2 together and actually do some analytics.

The .db file cannot be saved just anywhere: it must be stored inside the folder you pass to scopr::link_ethoscope_metadata, and within that, in a folder tree with the following default structure: SOMEFOLDER/machine_id/machine_name/datetime/datetime_machine_id.db. This is because link_ethoscope_metadata looks it up for you inside SOMEFOLDER. But this should not be a problem, since the ethoscope backup service saves the file with this structure for you.
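To double-check that layout before linking, here is a minimal Python sketch (run it from the folder that contains results; the expected depth follows the default structure above):

from pathlib import Path

# every .db file should sit at results/machine_id/machine_name/datetime/,
# i.e. four path components below results
for db in Path("results").rglob("*.db"):
    depth_ok = len(db.relative_to("results").parts) == 4
    print(db, "OK" if depth_ok else "unexpected location")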

zzaidi148 commented 3 years ago

You are an absolutely awesome human being @antortjim! Thank you for your extensive clarification.

Now, with the terminology clarified, I have tried using the tutorial's commands to begin linking my data and metadata and generating the plots. Below are the commands I've issued in RStudio (all rethomics packages have been installed) and their responses. I have generated my own metadata.csv, which looks like this if you need it.

DATA_DIR <- "/Users/Kramerlab/Downloads/ethoscope_data"
list.files(DATA_DIR)
[1] "metadata.csv" "results"
setwd(DATA_DIR)
library(scopr)
metadata <- fread("metadata.csv")
metadata
    machine_name   date region_id treatment replicate
 1: ETHOSCOPE_001 2/8/21         1        no         1
 2: ETHOSCOPE_001 2/8/21         2        no         1
...
20: ETHOSCOPE_001 2/8/21        20        no         1
21:            NA     NA
...(all-NA rows continue until row 80)

metadata[,date := fastPOSIXct(date, tz='UTC')]  ## to resolve the unexpected parse_date_time issue

metadata <- link_ethoscope_metadata(metadata, result_dir = "results")
There were 50 or more warnings (use warnings() to see the first 50)
warnings()
Warning messages:
1: In build_query(result_dir, query, index_file) :
  No result for machine_name == , date == NA and time == NA. Omiting query
...(the same warning is repeated up to 50)

I also get this when retrying this command:

> metadata <- link_ethoscope_metadata(metadata, result_dir = "results")
Error in check_columns(c("machine_name", "date"), query) : 
  The following columns are needed, but not found: date

Here is what my file tree looks like:

[screenshot of the file tree]

I know this is probably an issue worth raising in scopr, but since most support comes via this repository I think leaving it here will be fine.

pepelisu commented 3 years ago

You got it almost all correct. The problem here is your .csv file: the field date should be a text cell with the date in the following format, "YYYY-MM-DD", for example "2021-02-24". scopr will then look up the .db file that has that date in its filename; if there are two of them for the same ethoscope (experiments started consecutively but of short duration), it will take the last one. You can also specify the hour of the experiment if you do not want to load the last one: "YYYY-MM-DD-HH:MM". Be aware that Excel saves the date in a proprietary datetime format when it recognizes that the input resembles a date. To avoid that, simply format the date column as "text".

zzaidi148 commented 3 years ago

Thank you @pepelisu. I have adjusted my metadata.csv file yet am still greeted with similar issues:

metadata <- link_ethoscope_metadata(metadata, result_dir = "results")
There were 20 warnings (use warnings() to see them)
warnings()
Warning messages:
1: In build_query(result_dir, query, index_file) :
  No result for machine_name == ETHOSCOPE_001, date == 2021-02-08 and time == NA. Omiting query
...(the same warning is repeated 20 times)

This is what my metadata file looks like:

library(scopr)
metadata <- fread("metadata.csv")
metadata
    machine_name       date region_id treatment replicate
 1: ETHOSCOPE_001 2021-02-08         1        no         1
 2: ETHOSCOPE_001 2021-02-08         2        no         1
...
20: ETHOSCOPE_001 2021-02-08        20        no         1
metadata[,date := fastPOSIXct(date, tz='UTC')]

metadata <- link_ethoscope_metadata(metadata, result_dir = "results")
There were 20 warnings (use warnings() to see them)

Or could it be because I don't have access to an index file? I am still a bit confused as to what that is and how I can gain access to it. Thank you for your efforts as always!

antortjim commented 3 years ago

You need to make sure before linking the metadata of two things:

1) The class of the date field is character. Run class(metadata$date) to make sure. If it returns something else, the R code is likely to fail. I have experienced (it probably depends on the R and data.table versions) that data.table::fread automatically formats the date field as something called IDate, which looks identical to an R primitive character but causes the code to fail. It can be fixed by running metadata$date <- as.character(metadata$date) after importing the metadata with fread.

2) Your ethoscope database is available exactly at the folder that results from running file.path(getwd(), "results") (results being what you passed to link_ethoscope_metadata). Under the results folder there should be a machine_id/machine_name/datetime/datetime_machine_id.db file; otherwise the linking fails. I am calling "ethoscope database" the ensemble of .db files you have under results, but a single .db file can also be considered an ethoscope database, even though it really is just the data from one experiment. You can verify this by running list.files("results", recursive=TRUE).

I don't know what the index file you mention is.

zzaidi148 commented 3 years ago

This is what I mean by index file, I'm assuming:

[screenshot of the file listing]

antortjim commented 3 years ago

Oh, I see. This might be a new thing that I don't know about. We don't have this file and we don't need it, for now at least.

zzaidi148 commented 3 years ago

> You need to make sure before linking the metadata of two things: […]

Thank you @antortjim it seems that the metadata date was being read as 'Idate' and not 'date'. It is now linked and generating plots! Thank you so much!!!! One question I did have though was about generating the sleep plot I asked you about earlier. I am able to generate a plot, but am not sure if it is an aggregation of all the flies I have or if it is just one fly. I followed your instructions but am wondering how I can get the indiviual sleep plot graphs for each fly as you did in this figure:

# This code renders the normal sleep trace plot
ggetho(dt_raw, aes(y=asleep)) + stat_pop_etho() + scale_x_hours() + stat_ld_annotations()
# This one renders the plot from above
ggplot(dt_raw, aes(x=t, y=x)) + geom_line() + scale_x_hours() + stat_ld_annotations()

My plot:

[screenshot of the generated plot]

Desired plot:

pepelisu commented 3 years ago

To do that you need something like this:

ggplot(dt_raw, aes(x=t, y=x, colour=id)) + 
geom_line() + scale_x_hours() + 
stat_ld_annotations() + 
facet_grid(id ~ .)

You need to add the variable colour to the aes (aesthetics) and facet_grid for the grid distribution.

Check the documentation of facet_grid. I would recommend taking a look at the rethomics tutorial and documentation to see the options you have for plotting and analysing the data, in particular the sections dedicated to sleepr and ggetho. About the issue with IDate: since the latest version of data.table, the as.character transformation for that column is needed. I think an issue should be opened in the rethomics project.

ggilestro commented 3 years ago

Yes, I would recommend moving this discussion to one of the rethomics forums https://github.com/rethomics