Closed dezhi0730 closed 2 months ago
hi, thanks for the kind words :)
the intent of the transform is to shift and rescale the objectives to [0, 1]. They are supposed to be computed from the data, e.g.
shift = -1 * data.min()
scale = 1 / (data.max() - data.min())
upon inspection I think I forgot to update these values when I transitioned the code from an internal use case to the public version. Thanks for flagging! I'll double-check that these values are consistent with the data and push an update if not.
Looks like they are in fact incorrect. If you're curious, this is how I compute them:
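As a quick sketch of the shift/rescale described above (with made-up numbers, not the actual dataset statistics):

```python
# Toy objective values; in practice shift/scale come from the real data.
data = [2.0, 4.0, 6.0]

shift = -1 * min(data)               # -2.0
scale = 1 / (max(data) - min(data))  # 0.25

# (x + shift) * scale maps the data onto [0, 1].
normalized = [(x + shift) * scale for x in data]
print(normalized)  # [0.0, 0.5, 1.0]
```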
import pandas as pd

from cortex.data.dataset import TAPEFluorescenceDataset

# Use both splits so the transform covers the full observed range.
train = TAPEFluorescenceDataset(
    root="./.cache",
    download=True,
    train=True,
)
test = TAPEFluorescenceDataset(
    root="./.cache",
    download=True,
    train=False,
)
df = pd.concat([train._data, test._data], ignore_index=True)

# Renamed to avoid shadowing the `min` and `range` built-ins.
obj_min = df.log_fluorescence.min()
obj_range = df.log_fluorescence.max() - obj_min
print(f"Min: {obj_min}")
print(f"Range: {obj_range}")
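To connect the printed statistics back to the dictionary in question: given the printed Min and Range, the corresponding entries would be shift = -1 * Min and scale = 1 / Range. A minimal sketch with placeholder numbers (not the real dataset statistics):

```python
# Placeholder statistics; the real values come from the script above.
obj_min = 1.25
obj_range = 2.75

# GRAPH_OBJ_TRANSFORM-style entry for one objective (illustrative values).
log_fluorescence_transform = {
    "shift": -1 * obj_min,   # -1.25
    "scale": 1 / obj_range,  # ~0.3636
}

# Applying the transform maps the extremes of the data to 0 and 1.
lo = (obj_min + log_fluorescence_transform["shift"]) * log_fluorescence_transform["scale"]
hi = ((obj_min + obj_range) + log_fluorescence_transform["shift"]) * log_fluorescence_transform["scale"]
print(lo, hi)  # 0.0 1.0
```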
Hello,
The implementation and thought process of this project are impressive and have been very insightful.
I also have a question regarding the GRAPH_OBJ_TRANSFORM dictionary in the code. The specific section is as follows:
Could you please clarify how the values for "scale" and "shift" were determined for the "stability" and "log_fluorescence" transformations?
Are these values derived from the mean and variance of the training data, or is there another method or rationale behind their selection?
Understanding the origin and reasoning behind these values would help me better follow the data preprocessing steps.
Thank you!
Best regards,