Open po60nani opened 9 months ago
Hi @po60nani,
Thanks for your interest in MAGIK!
This error is due to EdgeExtractor expecting frames to start from zero. You can resolve this issue with nodesdf["frame"] -= nodesdf["frame"].min(). We will clarify this in the documentation.
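As a minimal sketch of that fix, assuming a node dataframe with a frame column like the one used in the tutorials (the values below are made up for illustration):

```python
import pandas as pd

# Hypothetical detections whose first frame is 7 rather than 0
nodesdf = pd.DataFrame({
    "centroid-0": [0.10, 0.12, 0.15],
    "centroid-1": [0.50, 0.52, 0.55],
    "frame": [7, 8, 9],
})

# Shift the frame index so that the earliest frame becomes 0
nodesdf["frame"] -= nodesdf["frame"].min()

print(nodesdf["frame"].tolist())
```

The shift only changes where counting starts; gaps between frames, if any, are left as they are.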
The problem is indeed related to what you mentioned. However, it is related not to offset = 0 but rather to nofedges = 0.
In cases where offset is 0 and nofedges is greater than or equal to 0, np.random.choice returns an empty array, which does not affect the function's performance. In such cases, duplicated_edges is simply assigned the value of edges.
The case you describe, in turn, arises when nofedges = 0 and offset > 0 (idx = np.random.choice(0, whatever > 0, replace=True) reproduces the error), indicating that some graphs in your batch do not have any edges.
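The two cases can be checked in isolation with NumPy alone. The helper below is a hypothetical stand-in that mirrors the call quoted above; it is not the library's implementation, and the behavior shown assumes a recent NumPy release:

```python
import numpy as np


def sample_padding_indices(nofedges, offset):
    """Draw `offset` indices from range(nofedges), with replacement.

    Hypothetical helper for illustration only; it mirrors the
    np.random.choice call quoted in this thread.
    """
    return np.random.choice(nofedges, offset, replace=True)


# offset == 0: no samples are requested, so an empty array comes back,
# whether or not there are edges to sample from
print(sample_padding_indices(5, 0).size)
print(sample_padding_indices(0, 0).size)

# nofedges == 0 and offset > 0: there is nothing to sample from,
# so NumPy raises a ValueError
try:
    sample_padding_indices(0, 3)
except ValueError as err:
    print("ValueError:", err)
```

If the last call matches the traceback you see, at least one graph in the batch reached the augmentation step with zero edges.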
To better help you resolve this issue, we need to confirm some details:
The EdgeExtractor function was not designed to be used as a standalone function. Instead, it was intended to be used within GraphExtractor, which ensures the proper handling of the dataframe until the final graphs are generated. If you haven't already, I recommend using this function as we do in the tutorials.
I suggest solving problem 1 first and then checking whether problem 2 still persists!
Thank you for providing additional insights into the issue. I appreciate your effort in investigating the problem. As suggested, I will focus on solving problem 1 and then reevaluate if problem 2 persists.
I have implemented the suggested solution by adjusting the frames using nodesdf["frame"] -= nodesdf["frame"].min(). However, the problem persists.
Could you please try with the following toy example?
import numpy as np
import pandas as pd
import deeptrack as dt
# like in your case
frame_shift = 0 # right case: 0
# randomly generated centroids
centroids = np.random.rand(80, 2)
frames = np.arange(0, 80) + frame_shift
nodesdf = pd.DataFrame()
nodesdf[["centroid-0", "centroid-1"]] = centroids
nodesdf["frame"] = frames
nodesdf["label"] = 0 # single particle
nodesdf["solution"] = 0
nodesdf["set"] = 0
# display the first 20 rows of the dataframe
nodesdf.head(20)
# Search radius for the graph edges
radius = 0.7
# time window to associate nodes (in frames)
nofframes = 3
# compute edges
edges = dt.models.gnns.graphs.EdgeExtractor(
    nodesdf,
    parenthood=np.ones((1, 2)) * -1,
    radius=radius,
    nofframes=nofframes,
)
Here, frame_shift = 7 reproduces the issue:
[Output: nodesdf, edges]
while frame_shift = 0 produces the correct output:
[Output: nodesdf, edges]
I have thoroughly examined the provided toy example, and it accurately reproduces the expected results you shared. However, when applying the code to my dataset, I encountered an error. To facilitate the troubleshooting process, I have uploaded both the CSV file (df_PSFs.csv) and the code for your review.
Code:
import logging
import numpy as np
import pandas as pd
import deeptrack as dt
from deeptrack.models.gnns.generators import GraphGenerator
logging.disable(logging.WARNING)
if __name__ == "__main__":
    path_csv = r'./df_PSFs.csv'
    nodesdf = pd.read_csv(path_csv)
    print(nodesdf.head(20))
    # normalize centroids between 0 and 1
    nodesdf.loc[:, nodesdf.columns.str.contains("centroid")] = (
        nodesdf.loc[:, nodesdf.columns.str.contains("centroid")]
        / np.array([1000.0, 1000.0])
    )
    nodesdf.loc[:, 'solution'] = 0.0
    nodesdf.loc[:, 'set'] = 0.0
    # shift frames so that the first frame is 0
    nodesdf["frame"] -= nodesdf["frame"].min()
    # display the first 20 rows of the dataframe
    nodesdf.head(20)
    # Search radius for the graph edges
    radius = 0.2
    # Time window to associate nodes (in frames)
    nofframes = 3
    # Compute edges
    edges = dt.models.gnns.graphs.EdgeExtractor(
        nodesdf,
        parenthood=np.ones((1, 2)) * -1,
        radius=radius,
        nofframes=nofframes,
    )
    a = 1
Additional Information: df_PSFs.csv
nodesdf is:
edges is:
Upon execution, I expect the code to run successfully without encountering any errors. The provided toy example validates this expectation, but the issue arises with my dataset.
Hi,
Thank you for your valuable network. I am currently trying to train MAGIK on my dataset, which has a structure similar to the ones in your tutorials. I encountered two issues during this process:
1. The EdgeExtractor function returns a DataFrame with extra columns containing NaN values. I noticed this discrepancy between my data and the provided tutorial data, and I'm not sure why these extra columns are present. As a workaround, I manually remove these extra columns before returning the data in the function.
Input_df:
Output_df:
2. I traced this error back to the SelfDuplicateEdgeAugmentation function, specifically in the inner function where offset = maxnofedges - nofedges results in an offset of 0. I'm unsure how to handle this situation and would appreciate guidance on resolving this issue.
Any assistance or clarification on these matters would be greatly appreciated.
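For reference, the workaround from the first problem (dropping the extra NaN columns) can be sketched with pandas as follows. The dataframe and its column names are made up for illustration and are not the actual EdgeExtractor output:

```python
import numpy as np
import pandas as pd

# Hypothetical edge dataframe; the column names are invented for this example
edges = pd.DataFrame({
    "frame_x": [0, 0, 1],
    "frame_y": [1, 2, 2],
    "extra": [np.nan, np.nan, np.nan],  # stands in for an unexpected all-NaN column
})

# List the columns that contain only NaN values
nan_only = edges.columns[edges.isna().all()].tolist()
print(nan_only)

# Drop them; columns with at least one valid value are kept
cleaned = edges.dropna(axis=1, how="all")
print(cleaned.columns.tolist())
```

Listing the all-NaN columns before dropping them makes it easier to see which input columns they come from.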
Best regards,