dondi / GRNsight

Web app and service for modeling and visualizing gene regulatory networks.
http://dondi.github.io/GRNsight
BSD 3-Clause "New" or "Revised" License
17 stars 8 forks source link

Explore possible refactor to consolidate creation of new networks #897

Closed dondi closed 2 years ago

dondi commented 3 years ago

The fix to #886 revealed that adding empty meta properties had to be done in multiple places. It would be good to explore whether this can be consolidated into a single utility function (e.g., createEmptyNetwork or other appropriate name) so that future changes to the baseline network object would only have to be done in one place.

kdahlquist commented 3 years ago

This is giving me flashbacks to GRNmap... :)

igreen1 commented 3 years ago

Done! Added createEmptyNetwork and initNetwork createEmptyNetwork... creates an empty network initNetwork creates an empty network then assigns fields based on what's passed in (either fills all the fields, none, or something in between)

Commit #4ba0807 on igreen1 has the fix ready to pull :)

igreen1 commented 3 years ago

Would love comments on the name createEmptyNetwork seems verbose to me initNetwork feels,,, short. I think initialize is the technically correct term for assigning values to variables but it ignores the portion of the code that also creates an empty network. But 'createNetwork' or 'createNetworkwithArgs' felt worse. Will discuss at the meeting if not before

kdahlquist commented 3 years ago

createEmptyNetwork sounds OK to me! initializeNetwork is OK, too. I don't think it's bad to have names this long if they are clear. Someone else reading the code later is going to know exactly what you are talking about.

My only question is that is it creating an empty network only or is it creating an empty data structure (like all the data, like expression, etc.) in which case the "network" part of it might be misleading and it should be called createEmptyDataStructure or initializeDataStructure (not sure if the case is right here).

@dondi, thoughts?

dondi commented 3 years ago

Agreed, clarity is more important than brevity when it comes to naming. At least, that is the preference these days now that programming languages don't have the same memory constraints that they used to have.

Good point about network vs. data structure…I agree with the latter though data structure is too generic.

I mean technically it is the network still…just a highly annotated one at this point. Does the "network network" (the part consisting solely of nodes and edges) have a reasonable alternate name?

Naming is hard. Might need to think a bit. If we keep "network" for the nodes and edges (which does also make sense because in the workbook these are the "network sheets") we would need a new name for the whole shebang that is more specific than "data structure." 🤔

kdahlquist commented 3 years ago

Maybe we call it a "workbook" with a nod to Excel. That at least implies that it's more than just the network data.

dondi commented 3 years ago

“Workbook” did cross my mind. createEmptyWorkbook …that can work.

The other idea was “GRN”—would it be fair to say that the nodes + edges are just the network, but the whole thing is the gene regulatory network? So createGRN or maybe createGRNObject to indicate that it is a representation of the GRN.

Then again GRNObject can be confused with grnState, which is the runtime state of the web app.

I can go either way.

kdahlquist commented 3 years ago

I like workbook better than GRN. The biological concept of a GRN doesn't match what we mean here. Maybe in the future we visualize protein interaction data which definitely wouldn't fit. I'd rather it be more generic like workbook.

dondi commented 3 years ago

OK, createEmptyWorkbook it is then. @igreen1

igreen1 commented 3 years ago

Went to go make these changes this morning before class, and I ran into an issue with that name.

In the import controller for excel workbooks, the term workbook is used to refer to the object containing the excel information. It is a parameter of the function. I feel like that might cause some confusion. The whole function revolves around parsing a workbook, so initializing network to 'empty workbook' might imply that network contains an excel-style object.

As an additional note, most lines into which I added this function are of the form var network = myFunction() which is why I initially chose 'network' in the name. I think keeping network in the name somehow might be beneficial to follow the naming format used currently. Personally, I don't like calling this object network, but the whole codebase calls the object containing all this information 'network' (accessed in GRNState.network). Accessing state.network yields the object with {expression, nodes, edges, meta ...}, correct naming or not, that is our current code base.

We could expand this refactor to renaming! I'm happy to go through and rename in the files (VSCode is powerful for this).

Let me know if you would like me to power forward with the naming! I could just throw a comment in the excel import along the lines of // This creates an empty object to hold imported information

dondi commented 3 years ago

@igreen1 Yes, what you’re seeing is a holdover from older versions of the app, when the object used to be solely the network. The created object has expanded in scope since then (as you have seen) but the variable names haven’t kept pace.

What if we call the Excel references “workbook file” and then the derived object can have a name based on “workbook” without “file?” Would that align things better?

dondi commented 3 years ago

To avoid future confusion with either mixed names or leftover older uses of network as a name (due to how the software started), @igreen1 will take a stab at a thorough renaming. Factors to ease this include (a) availability of automated renaming tools in current editors and (b) relatively consistent previous naming.

To recap:

dondi commented 3 years ago

@Onariaginosa observes that the variable names may be different from network—other possibilities include data, test, or test + sheet name. This may need to be looked at further to determine what name is appropriate.

igreen1 commented 3 years ago

This has been mostly implemented in #1490500. The program is running okay but the tests are failing for unknown reasons

igreen1 commented 3 years ago

Found the problem! I hadn't modified test-files/ so

  1. The file names had network, not workbook.
  2. The workbookFile.name property was still network because the internal xslx name was network not workbook.

@igreen1 will probably have to manually change this as VSCode doesn't work very well with .xslx (hence the issue to begin with) but would love ideas on how to mass-edit .xlsx files!

dondi commented 3 years ago

We established at the meeting that @igreen1 will not need to make filename changes.

igreen1 commented 3 years ago

901 addresses this issue with the new information from last meeting. Nothing new to add beyond that now network --> workbook if it references the data structure used by the program. But, it is kept as network when referencing the worksheet from xlsx files

Workbook --> workbookFile in all cases since it was only ever used as a reference to a .xlsx file 👍 Tests pass and it all looks good

dondi commented 2 years ago

Just missed closing…better late than never…