drighelli / SpatialExperiment

55 stars 20 forks source link

feat: check existence instead of equality for colnames of imgData #143

Open senbaikang opened 1 year ago

senbaikang commented 1 year ago

This simple change lifts the restriction of requiring exactly four columns in imgData with names being c("sample_id", "image_id", "data", "scaleFactor"). Instead of checking their equality, it now checks their existence, which gives more flexibility for the data structure.

senbaikang commented 1 year ago

Is there anyone running tests on the code? If you have concerns we can talk about it.

drighelli commented 1 year ago

Hi @senbaikang,

thanks for your PR, and sorry for the late answer, we are all involved also in several projects and we find time to answer PRs and Issues as soon as we are able to do so.

I have a concern about your PR: what happens if one of the required columns is not there? Could you tell me if you wrote a unit test to check if everything works smoothly?

Thanks, Dario

senbaikang commented 1 year ago

Hi @drighelli,

Many thanks for the reply! And no worries, I completely understand :)

Regarding your concern, if any of the four required columns do not exist, it will have exactly the same behavior as it has right now. The only thing I did is lifting the restriction of equality to existence. Unfortunately I did not write any unit tests. If you think such tests are necessary I can do it tomorrow.

Thanks, Senbai

senbaikang commented 1 year ago

Hi Dario,

I have added a few lines of code for user-defined columns of imgData and the corresponding sanity check. I also added a unit test function named imgData_1 in test_SpatialExperiment-validity.R.

Let me know your thoughts :)

Thanks, Senbai

drighelli commented 1 year ago

Thanks Senbai,

I approved for the automatic checks, I'll take a look into the code as soon as I can!

Dario

senbaikang commented 11 months ago

Hi Dario,

It's been a while since we communicated last time. Could you let me know the status?

Thanks, Senbai

lmweber commented 8 months ago

Hi @senbaikang , thanks for this pull request, and apologies that we didn't reply earlier. I'm working through this now for the upcoming Bioconductor release cycle.

Could you please let us know some additional background regarding this issue, i.e. the context where it came up? In principle I think it makes sense to allow additional custom columns in imgData, so I wanted to make sure we understand the context.

A second question I have is regarding the additional code in the pull request. Could you please give us a quick summary of what the additional code you have added in the SpatialExperiment() function does? (i.e. code to allow custom columns as well as additional code to handle adding / removing columns, etc). Also, we would like to avoid adding additional package dependencies whenever possible, so is it possible to avoid using purrr so this does not need to be added as a dependency? Thank you!

senbaikang commented 8 months ago

Hi @lmweber,

Thank you for the follow-up.

Regarding the rational behind the additional custom columns in the imgData, it will allow us to add extra information of the image, e.g., its width and height, to the instance without reading from the image every time when we need it.

Regarding the code in the SpatialExperiment() function, it originally takes additional arguments via ... only for contructing a SingleCellExperiment object. For the purpose of adding custom columns to imgData, additional arguments passed to SpatialExpriment() could also have arguments for the addImg() function. Thus we need some extra code in SpatialExpriment() to handle it. This should work properly based on the unit test code I added before.

For the current dependence of purrr, I am able to remove it. I will commit the modification by tomorrow.

Thank you again for processing this. Please let me know if you have further questions.

Best, Senbai

lmweber commented 8 months ago

Great, thanks for the explanation. So the additional ... is required for passing additional arguments to the addImg() function, that makes sense.

Yes, if it is possible to do this all without using purrr then that would be better, thank you. (We previously had a lot of dependencies in this package and trimmed them down a while back, so we prefer not to increase them again.)

senbaikang commented 8 months ago

I quickly made some modifications to remove the dependencies, as well as testing it. Please double check and test it again :)

lmweber commented 8 months ago

Great, thank you, yes this additional update looks good for removing the new dependencies. We will continue to work through the rest of the update and will follow up here if we have any more comments.

lmweber commented 8 months ago

Hi, yes I think this looks good. I have added a few more commits above to do the following:

We will also need to add a version bump, but I will wait to add this after we have merged https://github.com/drighelli/SpatialExperiment/pull/148 so we don't get a merge conflict.

I also noticed the updates changing 'NULL' to NULL in some of the .Rd files were due to using the latest version of roxygen2 (7.3.1), which is fine.

@senbaikang could you please check my commits above look ok to you too?

Then it would also be great if either @drighelli or @HelenaLC get a chance to look at this before we merge. Thank you!

lmweber commented 8 months ago

The github actions checks above are failing since they are still using the old github actions workflow. Once https://github.com/drighelli/SpatialExperiment/pull/148 is merged they should be ok.

Checks are passing locally, and I also double-checked with a merged branch in my fork here: https://github.com/lmweber/SpatialExperiment/actions/runs/8301379820

senbaikang commented 8 months ago

Hi Lukas,

Your commits look great to me, thanks! Sorry for forgetting to remove purrr from the description file.

Senbai

drighelli commented 8 months ago

Thanks Lukas @lmweber and Senbai @senbaikang,

I like the idea of putting additional columns to the imgData structure, but before merging it, I have a question.

Could the ... argument go in conflict with the SingleCellExperiment ... argument? Sorry, even if I reviewed the code, I'm not sure about this point.

Dario

lmweber commented 8 months ago

Thanks for checking. Yes good point, I'm not sure about that. Maybe we could add another test to check for this.

senbaikang commented 8 months ago

Hi Dario,

Thanks for reviewing the code!

To answer your question, the ... argument passed to SpatialExperiment() indeed takes care of arguments of SingleCellExperiment(), SummarizedExperiment() and addImg(). The code I added gives SingleCellExperiment() and SummarizedExperiment() higher priority, meaning that if an argument in ... matches one of the argument of these two constructors, it will not go to addImg(). Note that, if I understand correctly, the ... argument of SingleCellExperiment() is actually passed to SummarizedExperiment(), which has only six possibilities: assays, rowData, rowRanges, colData, metadata, checkDimnames. So, as long as the newly added columns of imgData do not have the same names as those arguments of SingleCellExperiment() and SummarizedExperiment(), it should be alright.

Hope this is clear.

Senbai

drighelli commented 8 months ago

Hi @senbaikang,

thanks for the explanation, but maybe, as Lukas suggested, it would be better to write a unit test to verify this aspect before merging into the main branch.

Would this be possible for you?

Dario

senbaikang commented 8 months ago

Hi @drighelli and @lmweber,

I have added some tests for the additional arguments of the constructor. Please have a look and let me know if they are ok.

Senbai

drighelli commented 5 months ago

@senbaikang could you please resolve the conflicts? If needed Thanks!

senbaikang commented 5 months ago

Hi Dario,

It looks like only a change in the NEWS file. I have resolved the conflict.

Senbai

drighelli commented 5 months ago

it looks good to me, pinging @lmweber and @HelenaLC if they want to add some additional comments, otherwise I'll merge it on my next round across the PRs.

Thanks @senbaikang !