Closed dmx2 closed 8 months ago
When I run make proteome, I get this error:
File "/home/joverton/arborist/src/protein_tree/src/sql_engine.py", line 12, in create_sql_engine
return create_engine(f"mysql+mysqlconnector://{user}:{password}@{host}:{port}/{database}")
It fails because I don't have port set, but I don't want to connect to MySQL when running Arborist.
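One way to avoid the crash described above is to build the connection URL only when every setting is present, so that steps which do not need MySQL can skip it. This is a minimal sketch, not the project's actual fix: the environment variable names and the build_mysql_url helper are hypothetical, and only the URL format is taken from the traceback.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable names; the real project may read its settings differently.
REQUIRED_KEYS = (
    "MYSQL_USER",
    "MYSQL_PASSWORD",
    "MYSQL_HOST",
    "MYSQL_PORT",
    "MYSQL_DATABASE",
)


def build_mysql_url(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return a connection URL, or None when any setting is missing or empty."""
    values = {key: env.get(key, "") for key in REQUIRED_KEYS}
    if not all(values.values()):
        # Let the caller skip the database step instead of failing.
        return None
    return (
        "mysql+mysqlconnector://"
        f"{values['MYSQL_USER']}:{values['MYSQL_PASSWORD']}"
        f"@{values['MYSQL_HOST']}:{values['MYSQL_PORT']}/{values['MYSQL_DATABASE']}"
    )
```

A caller would pass the result to create_engine only when it is not None, so a target like make proteome never touches the database.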
Yep, I can fix this no problem for make proteome. Don't you need the connection when running make iedb, and hence make all, if you do a full start-to-end build though?
I pulled the latest commits but I'm still getting the same error.
@jamesaoverton Sorry! This is likely because the protein tree submodule hasn't been updated. I made a lot of commits to that codebase and have now pushed the updated version, so it should work. The only problem I see is that makeblastdb and blastp are not in bin/ when running make deps, so this might need to be fixed.
Also, if you use git pull, make sure you do git pull --recurse-submodules so the submodule updates.
make proteome works for me now. make protein is not working yet. I'd like two changes, please:
1. [...] bin/. The Makefile checks that all the required binaries are on the PATH, which supports installation via system packages; otherwise it installs them to bin/. I prefer the system packages for BLAST and HMMER.
2. Add build/arborist/manual_assignments.tsv to the repository, or add a rule to fetch it in the Makefile.
make protein should work now with your requested changes. Let me know how it goes.
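The PATH-first lookup requested above, with bin/ as the fallback, could look roughly like this on the Python side. It is a sketch under assumptions: resolve_binary is a hypothetical helper, not a function from Arborist or the protein tree codebase.

```python
import shutil
from pathlib import Path
from typing import Optional


def resolve_binary(name: str, fallback_dir: str = "bin") -> Optional[str]:
    """Prefer a binary found on PATH; otherwise look in a local fallback directory."""
    on_path = shutil.which(name)
    if on_path is not None:
        # A system package (for example BLAST or HMMER) is installed.
        return on_path
    candidate = Path(fallback_dir) / name
    if candidate.is_file():
        # Fall back to a locally installed copy.
        return str(candidate)
    return None
```

Calling resolve_binary("blastp") would then return the system-installed path when one exists and only consult bin/ otherwise, which keeps both installation styles working.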
I think that manual-parents.tsv is missing a column. I pulled the latest code and ran make protein. It fetches the Google Sheet but fails with this error:
File "/home/joverton/arborist/src/protein_tree/protein_tree/assign.py", line 425, in _assign_manuals
manual_gene_map = manual_df.set_index('Accession')['Accession Gene'].to_dict()
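A failure like this is easier to diagnose if the sheet's header is checked before the lookup is built. The sketch below uses the standard library csv module rather than the pandas call in the traceback, and load_manual_gene_map is a hypothetical name; only the two column names come from the failing line.

```python
import csv
import io

# Column names taken from the failing line in assign.py.
REQUIRED_COLUMNS = ("Accession", "Accession Gene")


def load_manual_gene_map(tsv_text: str) -> dict:
    """Map each accession to its gene, failing early if a column is absent."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    header = reader.fieldnames or []
    missing = [column for column in REQUIRED_COLUMNS if column not in header]
    if missing:
        raise ValueError(f"manual assignments sheet is missing columns: {missing}")
    return {row["Accession"]: row["Accession Gene"] for row in reader}
```

With this check, a sheet that lacks the gene column produces an error that names the missing column instead of failing inside the lookup.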
@jamesaoverton Yes, I ran into this problem a few days ago and I've asked Randi to add the gene symbols to the SoT sheet. I CC'd you on the email thread, but for now we will make a copy and pull from there. I'll add the gene symbols.
Ok, I made a copy and I changed the URL - it should work now.... hopefully 🤞
Protein tree has been undone as a submodule. It works as a separate directory now. I'm merging, and then we can continue to make changes to the Makefile regarding make proteome and make protein as needed.
make proteome for selecting all proteomes works now.
We will need to sort out the proteome.tsv file issue and end up making the output of select_proteome.py the actual target.
make protein is still in the works, but getting there.
My protein tree codebase is now a submodule, so any edits I make to that codebase will be pushed there and then Arborist will have to pull the changes.