"It would be awesome to be able to upload multiple plants at the same time - and scroll through the observed RSA vs. calculated Pareto Front solutions - so some kind of drop-down selection menu based on the PlantID / RootID would be good (in the example input data - that would be in IMAGE column."
The IMAGE column is used as the plant ID in the CSV data.
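For reference, a minimal sketch of pulling plant IDs out of that column to populate a drop-down, assuming the input is the example pimpi_Big1_D3_Root_Nodes.csv mentioned below and the column is literally named IMAGE:

```python
import pandas as pd

# Sketch: list the unique plant IDs from the IMAGE column so a drop-down
# menu could be populated from them. The file name is the example input
# from these notes; a real app would use whatever file the user uploads.
df = pd.read_csv("pimpi_Big1_D3_Root_Nodes.csv")
plant_ids = sorted(df["IMAGE"].dropna().unique())
print(plant_ids)
```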
It looks like the data will be submitted already cleaned, so I think skipping step 1 is fine.
20210203_PimpiBig_analysis.Rmd might actually be a better README; it has more information about the workflow than the other documentation.
We should probably open it with RStudio or something similar.
The environment.yml file and the imports should also help with building the environment to run it.
It would also be good to figure out exactly what code we need for what they want. The data seems to be clean, but does that mean all we do is run the "math blackbox" code on each plant (identified by IMAGE) and simply display the results?
What is "'genotype_replicate_condition_hormone' format"?
In step 3, do we not list a name for the new architecture files? What exactly do these files look like?
It looks like the initial data we are given might already be architecture files.
It might be that we only need steps 5 and 6, if I'm reading the data and the steps correctly.
3:45 Going to read through the code and try steps 5 and 6.
Ran pip install for numpy, networkx, scipy, matplotlib, pandas, and seaborn -- this seems to set up the environment to the point where the code actually starts to run.
Plant-Architecture\analyze_arbors.py", line 37, in analyze_arbors
with open(fname, 'a') as f:
^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'data/results/statistics/arbor_stats.csv'
The above is the current issue I'm looking into while trying step 5. I will check whether the data is actually cleaned and where/when this file gets written.
Notes on what gets used (and should therefore be ported over): analyze_arbors uses values from constants.py.
Planning to read through more of the code to hunt down the step that creates this file, or else simply create the folders and drop my file in with an appropriately changed name.
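If nothing earlier in the pipeline creates that folder, a minimal workaround sketch would be to make the parent directory before opening the stats file in append mode (the path is taken from the traceback above; where this belongs in the code is an assumption):

```python
import os

# open(..., 'a') creates the file but not missing parent directories,
# which is what triggers the FileNotFoundError above.
fname = "data/results/statistics/arbor_stats.csv"
os.makedirs(os.path.dirname(fname), exist_ok=True)
with open(fname, "a") as f:
    pass  # the real code appends the computed arbor statistics here
```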
Running python write_metadata.py caused an error with pandas; fixed with pip install pyarrow.
Step 3 gives no errors but also prints nothing; assuming it ran correctly.
Step 4 gave an error:
\read_arbor_reconstruction.py", line 173, in main
for arbor in os.listdir(RECONSTRUCTIONS_DIR):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [WinError 3] The system cannot find the path specified: 'data/architecture-data/arbor-reconstructions'
So that will be the next thing to look into. Because the pipeline creates all of these directories and data files itself, it looks like you have to run through every step even if the data is already in its final form. Possible fixes: fork the repo and edit the code so it skips the generation and just uses the given file, or manually create the directories and name the file so the current code can find it.
Forgot to update the issue when I started looking at things again; started at 8:30. Got word back that the data is cleaned, so I'll make sure to use the correct file (the example from Magda may or may not be cleaned, but Dr. C's is).
Confused about step 2. It doesn't seem to modify the data folder or add anything, but it should be writing a metadata.csv; I don't see it. Also assuming it's two commands right now.
Looked at the rm command; the flag seems to mean force, not filter, I'm assuming.
When trying step 4, I get issues with missing files or directories. Moved the data file to the directory it seemed to be expected in, but now I'm getting the error below:
Traceback (most recent call last):
File "C:\Users\nethe\OneDrive\Documents\GitHub\Plant-Architecture\write_architecture_files.py", line 128, in <module>
main()
File "C:\Users\nethe\OneDrive\Documents\GitHub\Plant-Architecture\write_architecture_files.py", line 125, in main
write_arbor_files_full(raw_data_fname, RECONSTRUCTIONS_DIR)
File "C:\Users\nethe\OneDrive\Documents\GitHub\Plant-Architecture\write_architecture_files.py", line 53, in write_arbor_files_full
write_arbor_file_full(output_fname, curr_main_root_points, curr_lateral_roots)
File "C:\Users\nethe\OneDrive\Documents\GitHub\Plant-Architecture\write_architecture_files.py", line 19, in File "C:\Users\nethe\OneDrive\Documents\GitHub\Plant-Architecture\write_architecture_files.py", line 19, in write_arbor_file_full
with open(output_fname, 'w') as f:
^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'data/architecture-data/arbor-reconstructions/282_1_C_day3.csv'
And now write_metadata says the file doesn't exist as well.
Warnings I'm currently getting:
- Step 3: when doing the remove command with force, it says object not found.
- Step 4: missing files or directories.
Planning to create a file-handling flow chart for the code, since that seems to be where the issues are now and I'm not having as much luck as with my piece-by-piece strategy for the environment. Although I might try to troubleshoot steps 5 and 6 one more time before doing that.
Currently step 5 gives:
FileNotFoundError: [Errno 2] No such file or directory: 'data/results/statistics/arbor_stats.csv'
at \analyze_arbors.py", line 37, in analyze_arbors:
with open(fname, 'a') as f:
Forgot to post a follow-up. I was tracing through write_metadata.py in my repository and adding a few notes.
Also looking to make sure the metadata folder wasn't just hidden
Trying to figure out when the metadata folder and metadata.csv are created. They seem to be made before step 2, which is the first step that has any commands given.
Waiting on a reply to my email about the metadata folder and CSV file. Working on step 3, which results in:
python write_architecture_files.py pimpi_Big1_D3_Root_Nodes.csv
pimpi_Big1_D3_Root_Nodes.csv
Traceback (most recent call last):
File "Plant-Architecture\write_architecture_files.py", line 128, in <module>
main()
File "Plant-Architecture\write_architecture_files.py", line 125, in main
write_arbor_files_full(raw_data_fname, RECONSTRUCTIONS_DIR)
File "Plant-Architecture\write_architecture_files.py", line 53, in File "Plant-Architecture\write_architecture_files.py", line 53, in write_arbor_files_full
write_arbor_file_full(output_fname, curr_main_root_points, curr_lateral_roots)
File "Plant-Architecture\write_architecture_files.py", line 19, in write_arbor_file_full
with open(output_fname, 'w') as f:
^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'data/architecture-data/arbor-reconstructions/282_1_C_day3.csv'
Currently tracing through write_architecture_files.py, trying to understand how it reads the CSV and when this file should be created.
Continuing to check write_architecture_files.py. Maybe I should just add the missing directories? The code should create the file itself.
Added the missing directories and things started working. Needed to create, inside the data folder:
- results, containing pareto-fronts and statistics
- architecture-data, containing arbor-reconstructions
Was able to run steps 3 and 4. Started on 5; waiting for it to complete.
All steps from 3 onward work now, but they don't seem to produce visualizations. However, a lot of files are created.
So it looks like there's more information in the .ipynb files, and they have some figure-creating capabilities, but none of the figures look like the ones they want. I'm not exactly sure what he uses to make the visualizations we were expecting to see, or what exactly they want to see at this point. Creating a file of the output and letting the user download it probably won't be an issue, depending on how the information can be handled; I will look into that. For now I will email Dr. C about visualizations, and we should check in with Magda about what exactly she wanted displayed.
Read through analyze_arbors and arbor_statistics; drawing is definitely done in the arbor_statistics file, since it imports pylab. Line 46 needs to be uncommented to draw. Not sure about the command line options though.
Forgot to comment when I started looking at the code about 6 minutes ago.
Apparently the files I thought would be math-related, and that he gave no commands for in the README, are the ones related to graph creation. So helpful. Anyway, to get running:
- create a metadata directory in the data directory
- pip install pingouin

Current issue, while trying to run python arbor_statistics.py --histogram: it can't find 'data/scoring-data/manual-scoring-last-day.csv'.
In the repo, create a folder figs. In figs, create folders drawings and plots:
- in plots, make folders null-models-analysis and pareto-front-location-analysis
- in drawings, create folders arbors, toy-network, and pareto-front-drawings
I went ahead and made the null-models folder in the results folder as well.
OK, well, I still can't find where the file it's trying to read gets created. I've tried running all the commands again, but that doesn't seem to change anything (as expected). Now trying to run python arbor_statistics.py --time.
Looking through the code to understand why the manual scoring data is required.
Tried commenting out the parts related to manual-scoring-last-day.csv, but that hasn't been successful. With them commented out, the time plot produces an empty plot that might have a single vertical line in the middle. Unsure how to recreate this data or get around it.
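Instead of commenting the references out, a less invasive sketch I might try is skipping the manual-scoring comparison when the file is missing (the surrounding logic here is a placeholder, not the actual arbor_statistics.py code):

```python
import os
import pandas as pd

SCORING_FNAME = "data/scoring-data/manual-scoring-last-day.csv"

def load_manual_scoring():
    # Skip the manual-scoring comparison if the CSV isn't present, instead
    # of letting the plotting code crash with FileNotFoundError.
    if not os.path.exists(SCORING_FNAME):
        print(f"Skipping manual scoring; {SCORING_FNAME} not found")
        return None
    return pd.read_csv(SCORING_FNAME)

scores = load_manual_scoring()
```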
There's some more plotting code commented out in pareto_functions.py that might make a Pareto front graph, but I'm not sure about uncommenting it. Also, this file seems to rely on a specific hard-coded .csv file.
Looking further into the pareto front stuff
Still having trouble figuring out how to create the graphics being requested.
Going to try reading through each file thoroughly to see if/what I'm missing.
Forked repository and added info to the readme
Going to work on the new repo for plant architecture. Getting set up and checking which files import pylab.
Opened a codespace; it doesn't seem to want to load source control. Found a draw-arbors function in util.py, but when trying to find references, nothing else came up.
Also found a function in pareto_functions.py that looks like it does drawing. Going to look through the files that have a main function for argument parsing.
Almost done going through a first pass looking at all files with main functions and creating notes in the readme.
Finished reading and committed the notes to the README. Working on getting the codespace workable. Was able to use oxybuild and pip install pingouin to get through step 5.
I was able to complete all 6 steps. Edited out references to manual-scoring-last-day.csv. After creating the figs/plots folders, ran python arbor_statistics.py --time; however, it produced the same graphic, which just had a straight line. I'm not sure what might be going wrong, whether it's the data or a missing step in the pipeline.
Created a null-models folder in the results folder. Ran python null_models.py -a, which worked.
Going to try and thoroughly read through null_models_analysis.ipynb for the next step since it does have some graphics and might help with the pipeline issue.
The command python null_models.py -w also worked.
Will look more into the null_models_analysis.ipynb for some insight later.
Created a makeDirectories.py file to automatically make all the directories needed. This is just to help with setup when demonstrating what the code currently does.
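For reference, a minimal sketch of the idea behind makeDirectories.py, using the folders listed in these notes (the exact list in my script may differ):

```python
import os

# Directories the pipeline and plotting scripts expect, collected from the
# notes in this issue.
DIRECTORIES = [
    "data/metadata",
    "data/architecture-data/arbor-reconstructions",
    "data/results/pareto-fronts",
    "data/results/statistics",
    "data/results/null-models",
    "figs/drawings/arbors",
    "figs/drawings/toy-network",
    "figs/drawings/pareto-front-drawings",
    "figs/plots/null-models-analysis",
    "figs/plots/pareto-front-location-analysis",
]

for d in DIRECTORIES:
    os.makedirs(d, exist_ok=True)  # no-op if the directory already exists
```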
Hard-coded a file into pareto_functions.py to try to test the viz_front method, but got an error on append:
python pareto_functions.py
Traceback (most recent call last):
File "/workspaces/Plant-Architecture/pareto_functions.py", line 352, in <module>
main()
File "/workspaces/Plant-Architecture/pareto_functions.py", line 348, in main
viz_front(G)
File "/workspaces/Plant-Architecture/pareto_functions.py", line 334, in viz_front
scatter_df = tree_costs.append(pareto_front)
File "/home/codespace/.local/lib/python3.10/site-packages/pandas/core/generic.py", line 6204, in __getattr__
return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'append'. Did you mean: '_append'?
Did some more looking to see if I could figure it out.
Looking into files that are generated and drafting an email about output
It looks like DataFrame.append is deprecated and needs to be replaced, or I need to use an older version of pandas.
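A minimal sketch of the replacement I'd try in viz_front, with toy DataFrames standing in for tree_costs and pareto_front (their real columns are an assumption here):

```python
import pandas as pd

# Toy stand-ins for the two DataFrames combined in viz_front; the column
# names are placeholders, not the actual ones in pareto_functions.py.
tree_costs = pd.DataFrame({"x": [1.0, 2.0], "y": [3.0, 4.0]})
pareto_front = pd.DataFrame({"x": [0.5], "y": [5.0]})

# DataFrame.append was deprecated in pandas 1.4 and removed in 2.0;
# pd.concat is the supported replacement for
#   scatter_df = tree_costs.append(pareto_front)
scatter_df = pd.concat([tree_costs, pareto_front], ignore_index=True)
print(scatter_df)
```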
I got an image! 189_3_C_day3-pareto-front.pdf
This might be what he wants. @SeraphinaStephen
This one was made from a file that looks different from the one used for the previous image: 194_1_C_day3-pareto-front.pdf
To produce images for all files in arbor-reconstructions, just run python pareto_functions.py.
Looked at home page to get a sense of how things will interact
Working on making sure I have the correct repository on my own computer and that things still work. Trying to streamline the file-processing procedure to make it easier when running through the website. Having difficulty getting VS Code to set up a nice Python environment for it, though.
working on file flow and folders that will be needed when hosting on website. #8
Still working on file flow chart. Trying to minimize the number of directories we would need. And make sure we only use the files or functions being used for the project.
Still working on this
This issue holds notes and timestamps for working on Dr. C's code.