ComputationalAgronomy / biopathway-prediction


Meeting Schedule #2

Closed yehzx closed 1 year ago

yehzx commented 2 years ago

9/15 Agenda

    • [x] progress report
      A short discussion of the following steps and the schedule from now to the version 1 workflow: NCBI DNA Sequences -> Prodigal -> Blastp -> Enzyme Mapping -> Result -> Visualization.
      Some questions about programming style (e.g. whether to do it in shell script or Python, the best place to save temp files...).
    • [x] parallel running of programs
      I'll create a new issue to describe what's going on and how the result differs from what I expected.

(1: highest priority, not necessary to finish them all)

Not important, just something I wonder: if I create an EC2 instance with an attached SSD and create an image from it, what will happen if I load that template on an instance that has no attached SSD?

Todo before early October (9/22, 9/29 two meetings left):

yehzx commented 2 years ago

9/22 Agenda

    • [x] github repo line ending settings
      Since I develop all the programs on Windows but need to run them on Linux, we need proper line-ending conversion when I commit my files. The situation is a bit complicated because I have both CRLF and LF files in my local folders. I've tried to fix it, and it currently looks fine running on an EC2 instance, but I'm not sure my settings apply to every situation. We may discuss this a bit.
    • [x] EC2 git clone via https
      I cannot clone my repo in the instance and have spent a few hours trying to deal with it. I think the problem might be the existing gitconfig that prevents me from cloning my repo. Your gitconfig is write-protected, so I think we need to deal with this together.
      Cloning into 'biofertilizer-prediction'...
      fatal: unable to get credential storage lock in 1000 ms: Permission denied
    • [x] version 1 is completed
      Let's see what the next step is and do the code review. Try running this:
      python3 biofertilizer-prediction/main.py prodigal_test/GCF_000010725.1_genomic.fna

      Todo:

  1. Need a config file to load the location of the BLAST database (otherwise, every time I change it I need to change the variable in run_blast.sh).
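
One possible shape for that config file, as a minimal sketch: an INI file read with Python's standard configparser. The section name `blast`, the key `db_path`, and the function name are placeholders for illustration, not names taken from the repo.

```python
import configparser
from pathlib import Path


def load_blast_db(config_file: str) -> Path:
    """Return the BLAST database location stored in an INI file.

    The section "blast" and the key "db_path" are hypothetical names.
    """
    parser = configparser.ConfigParser()
    # parser.read() silently skips missing files, so check what it loaded.
    if not parser.read(config_file):
        raise FileNotFoundError(f"config file not found: {config_file}")
    return Path(parser["blast"]["db_path"])
```

main.py could then hand the returned path to run_blast.sh as an argument, so the script itself no longer needs editing when the database moves.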
stevenhwu commented 2 years ago

Regarding 9/22

  1. I think you can change the setting in VS Code. Otherwise, regular expression FTW.
  2. Can you try it with git clone -v https://github.com/ComputationalAgronomy/biofertilizer-prediction.git (verbose mode)? I think you will get that error, but it can still clone the repo:
    
    [zye21@ip-172-31-87-185 ~]$ whoami
    zye21
    [zye21@ip-172-31-87-185 ~]$ git clone https://github.com/ComputationalAgronomy/biofertilizer-prediction.git
    Cloning into 'biofertilizer-prediction'...
    fatal: unable to get credential storage lock in 1000 ms: Permission denied
    remote: Enumerating objects: 147, done.
    remote: Counting objects: 100% (147/147), done.
    remote: Compressing objects: 100% (78/78), done.
    remote: Total 147 (delta 83), reused 127 (delta 63), pack-reused 0
    Receiving objects: 100% (147/147), 28.62 KiB | 9.54 MiB/s, done.
    Resolving deltas: 100% (83/83), done.
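
On the line-ending question, besides the editor setting, one option is to normalize files with a small script before committing. This is only a sketch, assuming every affected file is plain text that should end up with LF endings; the function names are illustrative.

```python
from pathlib import Path


def to_lf(data: bytes) -> bytes:
    """Convert CRLF and lone CR line endings to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def normalize_file(path: Path) -> bool:
    """Rewrite the file with LF endings; return True if it changed."""
    original = path.read_bytes()
    converted = to_lf(original)
    if converted != original:
        path.write_bytes(converted)
        return True
    return False
```

Git can also do the conversion itself through a .gitattributes rule such as `*.sh text eol=lf`, which avoids depending on each developer's local settings.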
yehzx commented 2 years ago

9/29 Agenda

    • [x] when to make the repo public
      I think opening it during 10/7 - 10/21 is enough. Maybe let's go through the steps together, another time or now.
    • [x] a short discussion of the paper, to see which of the methods mentioned in it we can adapt to improve this project.
    • [x] some questions about the code in my main.py and module.py
      Actually, I think some of the code could be tidier and more elegant, because some parts are very similar. I just want to ask for your opinion and see how you would implement the same things.
    • [x] terminate the EC2 instance automatically after syncing with S3
      Just curious about this, because we may launch some larger instances and I don't want to have to monitor them from time to time.
stevenhwu commented 2 years ago
  1. Actually, since you created this repo, you might have the permission to change that.
  2. to discuss.
  3. Yes, there is something called subcommands in argparse, which can be combined with set_defaults and functions to avoid duplication.
  4. Yes, with aws ec2 terminate-instances you can do that automatically.
    
    aws s3 sync s3://data local
    python3 main.py --param xxx
    aws s3 sync result s3://data
    aws ec2 terminate-instances --instance-ids INSTANCE_ID
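
For point 3, a minimal sketch of argparse subcommands combined with set_defaults. The subcommand names and the two handler functions are made up for illustration and are not the actual functions in main.py.

```python
import argparse


def run_prodigal(args):
    # Placeholder body; the real step would call Prodigal here.
    return f"prodigal on {args.input}"


def run_blast(args):
    # Placeholder body; the real step would call blastp here.
    return f"blastp on {args.input}"


def build_parser():
    parser = argparse.ArgumentParser(prog="main.py")
    subparsers = parser.add_subparsers(dest="command", required=True)

    # Options shared by every subcommand are declared once in a parent parser.
    common = argparse.ArgumentParser(add_help=False)
    common.add_argument("input", help="input file")

    for name, func in [("prodigal", run_prodigal), ("blast", run_blast)]:
        sub = subparsers.add_parser(name, parents=[common])
        # Each subcommand records its own handler, so main() needs no if/elif chain.
        sub.set_defaults(func=func)
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    return args.func(args)
```

Calling `main(["blast", "genome.fna"])` dispatches to `run_blast`; adding a new step means adding one function and one entry in the list.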

stevenhwu commented 2 years ago

Check out nohup and & for running jobs in the background.

e.g.

nohup aws s3 cp SOURCE DEST  &
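
Putting this together with the sync-and-terminate commands from the previous comment, a wrapper script could run the steps in order and be started under nohup. This is a rough sketch: the bucket, the folder names, and `--param xxx` are the same placeholders as above, and the function names are invented for illustration.

```python
import subprocess


def build_steps(instance_id, bucket="s3://data"):
    """Return the commands in the order they should run."""
    return [
        ["aws", "s3", "sync", bucket, "local"],
        ["python3", "main.py", "--param", "xxx"],
        ["aws", "s3", "sync", "result", bucket],
        ["aws", "ec2", "terminate-instances", "--instance-ids", instance_id],
    ]


def run_steps(steps, runner=subprocess.run):
    """Run each command in turn; check=True raises at the first failure."""
    for cmd in steps:
        runner(cmd, check=True)
```

Because a failing step raises before the terminate command is reached, the instance stays up when something goes wrong; that keeps the results inspectable but means a failed run still needs attention. Saved as a script that calls `run_steps(build_steps(INSTANCE_ID))`, it can be launched with nohup and & like the example above.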
yehzx commented 1 year ago

10/6 Agenda

    • [x] Discuss some of my ideas for analyzing the results
      See S3 bucket compagron-results/zye21/meeting/. To see what the pathway looks like, please refer to the diagram in the folder src/pathway/ (branch: development_IAA).
    • [x] A few things to make my life easier? #7

      Still in progress:

      refactor module.py

yehzx commented 1 year ago

10/13 Agenda

    • [x] Review the new commits in #8
      I added and revised a few things after you refactored my code.
    • [x] Work in progress: refactor module.py
      Let's have a look at the branch refactor_main_and_module and see whether I should keep going as is or modify something first.
    • [x] Batch download through FTP to aws s3? wget and curl?
yehzx commented 1 year ago

10/20 Agenda

    • [x] config class #10

Not much other progress this week. I might do a bit of paper review before running larger datasets, and I'm still looking into how the diamond-blastp result differs from the blastp result. I might use diamond-blastp at a later stage because it's freakishly fast. So let's finish our meeting early. BTW, I got an interview opportunity on 10/21.

yehzx commented 1 year ago

11/3 Agenda

    • [x] Configuration class review #10
      I finished its prototype and have some questions I want to discuss.
    • [x] A systematic way and training to improve my CS foundation
      I plan to do this later on; it will be one of my targets next semester.
    • [x] The idea of the metagenomic data from rhizosphere
yehzx commented 1 year ago

11/10 Agenda

    • [x] How to change the owner of the mounted disk in EC2?
      I need sudo to run every command. Or is there another method people generally use to solve this (e.g. mounting under my /home/zye21/... rather than /...)?
    • [x] Regarding xargs, the following works (export -f is what makes a bash function visible to the child shells):
      export -f my_function
      echo "$my_file" | xargs -I {} -P "$cpus" bash -c 'my_function {}'
      ...
      unset -f my_function
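
If the same fan-out is ever needed from Python rather than the shell, a thread pool gives a rough equivalent of `xargs -P`. This is a sketch under the assumption that each task mostly waits on an external program; `run_parallel` and its parameters are illustrative names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(items, worker, max_workers=4):
    """Apply worker to every item, at most max_workers at a time.

    Results come back in the same order as items.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, items))
```

Threads are enough here because a worker that calls an external tool through subprocess spends its time waiting on that process, not running Python code.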
yehzx commented 1 year ago

11/17 Agenda

    • [x] Should I use a code formatter?
yehzx commented 1 year ago

12/1 Agenda

yehzx commented 1 year ago

12/8 Agenda

    • [x] I went through the manual INDELible provides and roughly know how to generate sequences.
      But I'm still thinking about how to apply these sequences to model validation, and about some implementation details.
      Let's come up with some concrete ideas together.
      INDELible
yehzx commented 1 year ago

2/23 Agenda

    • [x] See some results in NAS
      In /home/zye21/meeting/
    • [x] Discuss the title for the thesis
      Are we going to emphasize the model or the IAA pathway? And how much effort should I put into describing the pipeline?
    • [x] Schedule some deadlines for each part (intro, materials and methods, ...) of the thesis
      I would like to start with the materials and methods part and check whether I've missed something that should be done.
    • [ ] Code refactor?
yehzx commented 1 year ago

3/2 Agenda

    • [x] See some results in NAS
      In /home/zye21/meeting/
    • [x] Is it a good idea to compare my results to the previous study and mention that in my thesis?
      We use the same dataset but a different pipeline. Or is it enough to focus on my model?

Regarding OKR:

    • [x] Though we have key results, we still need some objective criteria to keep track of progress along the way (not just the final results). I just wonder: in industry, how do you evaluate how much you did over the last week or month?
yehzx commented 1 year ago

3/9 Agenda

    • [x] See results in NAS
      In /home/zye21/meeting/. Discuss which figures to put in the thesis.
yehzx commented 1 year ago

3/16 Agenda

Thesis link: https://docs.google.com/document/d/10lqDwVbDb0KZTmprmA6EVAgbG49xLiEe4aNTQ7K0ZyA/edit?usp=share_link

    • [x] See results in NAS
      In /home/zye21/meeting/. Discuss what to put in the thesis.
    • [x] Things to be careful about before building a database on a virtual machine.
yehzx commented 1 year ago

3/23 Agenda

Thesis link: https://docs.google.com/document/d/1bOeHPni19Em9lWqQFGDKYDTxwu88EGEu2UpmxfQ0tB0/edit?usp=sharing

yehzx commented 1 year ago

3/30 Agenda

Thesis link: https://docs.google.com/document/d/1hucHY0p9OeQA3M5L21cUbSoE8jRoj3zWA9QihPK1MVU/edit?usp=sharing

yehzx commented 1 year ago

4/6 Agenda

Thesis link: https://docs.google.com/document/d/12n7QD92aPdTcDsMJlWsc7O6dSViUn6WZvN5JJacKMzI/edit?usp=sharing

yehzx commented 1 year ago

4/8 latest thesis version

link: https://docs.google.com/document/d/12K3Hs8PA8XGnz5AcfRBnrr1bT6lsAd1929QWZBZ2I2A/edit?usp=sharing