charlottewright / lep_fusion_fission_finder

A tool to assign ancestral linkage units and/or identify fusion/fission events in Lepidopteran chromosomes based on a set of reference BUSCO genes as markers.
MIT License

Question about map_fusion_fission_client.py #1

Open XuanZhang-Black opened 2 months ago

XuanZhang-Black commented 2 months ago

Dear Dr. Charlottewright:

Hello! When I ran the code you provided to infer Lepidoptera chromosome fusion events, an error occurred that I do not know how to solve. This is the command I ran:

    python3 map_fusion_fission_client.py -i fusion_finder/ -tree ../syngraph/4707.newick -t 1 -o ./ -f output -l Ture

The following information was displayed:

    [+] Running map_fusion_fissions.py with a threshold of 1
    [+] Parsing chromosome assignment files
    [+] Parsing the tree
    [+] Summarising status of each chromosome
    Traceback (most recent call last):
      File "~/unmasked_genome/LFFF/map_fusion_fission_client.py", line 45, in <module>
        df_combined, spp_list = gather_stats_and_make_table(file_list, input_data)
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "~/unmasked_genome/LFFF/merian_tools.py", line 117, in gather_stats_and_make_table
        for query_chr in file.query_chr:
                         ^^^^^^^^^^^^^^
      File "~/miniconda3/lib/python3.11/site-packages/pandas/core/generic.py", line 6299, in __getattr__
        return object.__getattribute__(self, name)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AttributeError: 'DataFrame' object has no attribute 'query_chr'

I put the lep_fusion_fission_finder.py results for each species in the fusion_finder folder, and the chromosomes of each species are numbered 1-31. Can you tell me how to solve this?

Another problem: I used syngraph with the species at the base of my species tree as the reference. In -m2 mode, syngraph infers 13 chromosome fusions in my species, whereas the LFFF method with -w 17 infers 12. How should the conflict between these two results be explained? You say in the article that the two methods give almost the same results, and my species is also a butterfly within Lepidoptera.

Sincerely, Xuan Zhang

charlottewright commented 1 month ago

Hi Xuan,

Thank you for trying out this tool! It seems the problem is in accessing the output from the first command (lep_fusion_split_finder.py). Please could you check whether that output includes files named in the format "SPP_chromosome_assignments.tsv", where the first line of each file is the header "query_chr status assigned_ref_chr", followed by declarations for each of your chromosomes? The line "AttributeError: 'DataFrame' object has no attribute 'query_chr'." makes me think this header isn't being found. If the files do look OK, feel free to send me your output from lep_fusion_split_finder.py and your tree, and I can try to look into it further.
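If it helps, the header check can be scripted. The following is a minimal sketch, not part of the tool: it assumes the files are tab-separated (as the .tsv extension suggests) and match the pattern "*_chromosome_assignments.tsv". The function names `check_assignment_header` and `check_directory` are made up for this example.

```python
import csv
from pathlib import Path

# Column names taken from the header described above.
EXPECTED_HEADER = ["query_chr", "status", "assigned_ref_chr"]


def check_assignment_header(path, delimiter="\t"):
    """Return (ok, header) for one chromosome-assignment file.

    Assumption: the file is tab-separated; pass another delimiter if not.
    """
    with open(path, newline="") as handle:
        # An empty file yields an empty header rather than raising.
        header = next(csv.reader(handle, delimiter=delimiter), [])
    header = [col.strip() for col in header]
    return header == EXPECTED_HEADER, header


def check_directory(directory, pattern="*_chromosome_assignments.tsv"):
    """Check every matching file in a directory.

    Returns a dict mapping file name to (ok, header).
    """
    return {
        path.name: check_assignment_header(path)
        for path in sorted(Path(directory).glob(pattern))
    }
```

Calling `check_directory("fusion_finder/")` and printing every entry whose first element is `False` should show which files, if any, lack the expected header. An empty result would suggest that no file in the folder matches the expected naming pattern.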

The difference between the number of fusions detected by syngraph and lep_fusion_fission_finder is most likely due to the different sensitivity thresholds used by each tool. If you are not specifying the number of markers required for a given chromosome to be detected with syngraph, then the default of 5 is being used, which is more sensitive than the threshold of 17 with lep_fusion_fission_finder. Varying this number of markers in both tools should give you a feel for how the results change and what the best cut-off is. It may also help to visualise your fusions to see which fusion is being missed when you get 12 rather than 13, and whether this is a genuine fusion that should be detected. To do this, you could upload your BUSCO table for the species of interest here: https://charlottejwright.shinyapps.io/busco_painter/ or run it on the command line via this tool https://github.com/charlottewright/lep_busco_painter.
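To illustrate why the marker threshold can change the count, here is a toy sketch. It is not the algorithm used by syngraph or by lep_fusion_fission_finder; it simply assumes that an ancestral unit counts as present on a chromosome when at least `min_markers` markers support it, and that a chromosome carrying k supported units implies k - 1 fusions. The function name `count_fusions`, the unit names and the marker counts are all invented for the example.

```python
from collections import Counter


def count_fusions(marker_assignments, min_markers):
    """Count fusions implied by markers under a minimum-support threshold.

    marker_assignments: iterable of (query_chr, ref_unit) pairs, one per marker.
    Toy model only: a unit is "supported" on a chromosome if it has at least
    min_markers markers there; k supported units imply k - 1 fusions.
    """
    counts = Counter(marker_assignments)
    units_per_chr = {}
    for (query_chr, ref_unit), n_markers in counts.items():
        if n_markers >= min_markers:
            units_per_chr.setdefault(query_chr, set()).add(ref_unit)
    return sum(len(units) - 1 for units in units_per_chr.values())


# Invented data: chr1 carries 40 markers from unit M1 and 8 from unit M2.
markers = [("chr1", "M1")] * 40 + [("chr1", "M2")] * 8 + [("chr2", "M3")] * 30

print(count_fusions(markers, 5))   # 1: M2 passes the threshold, so chr1 looks fused
print(count_fusions(markers, 17))  # 0: M2 has too few markers to be counted
```

In this toy case the small block of 8 markers is counted at a threshold of 5 but not at 17, which is the kind of difference that could account for 13 versus 12 fusions.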

Best wishes,

Charlotte