Open Phlya opened 10 years ago
Yeah, the fields thing should definitely be corrected. I guess the package never got used enough to catch all these kinds of bugs. As for your track missing the names of the features, I thought I recalled having something that would add blank fields if, for instance, the track was missing names or strand information... maybe it's not working correctly.
The story behind all this is that I wrote this package two years ago when I was hired in a bioinformatics core facility. They wanted something to process and manipulate genomic tracks from different formats. This is the project we came up with. But plans changed, the team moved on to something else, I started a PhD at a different university, and it never really got used.
It's a pity because I thought it was a nice idea with some potential, and I did invest about 6 months coding it. In my opinion, the genomics field really needs a universal parser library with an SQL (or HDF5) backend and a comprehensive interface. Unfortunately I can't provide support for it anymore, as I'm overwhelmed with other things. But the code is GPL licensed, so you are welcome to do whatever you want with it! A few other users exist, and @bow has contributed to the package a bit in the past. Maybe you could ask him?
It was one of my first Python projects; I would do it differently today. I think I made the architecture a bit too complicated, and I would now rather go for something that requires more lines of code from the user but is much simpler to read and maintain.
Right, I see... It is a very sad story, I would say, because as far as I know it is the most comprehensive Python package for such things, while others can't really handle many formats. It also has very nice options for manipulating tracks, which are really useful. Of course, there are things to improve, not counting bugs, but still, I like it very much.
BTW, I have posted an issue in Biopython's tracker about creating a parser for such files or adopting an existing one, and suggested using track; it would be great if you commented there.
As for me contributing myself... I would love to, but I am afraid I am not experienced enough, and my contribution won't do much good to the project.
As for your last point, I think that the good side (ease of use) is also very important; that's why I am using the package myself.
Can you link to the issue in Biopython's tracker?
Hi! I know you said you don't really maintain the package anymore, but still, I found a weird thing and thought you might improve it... For now I did a dirty hack in my installation where necessary, but without understanding the architecture of the package it is hard to do it properly.
So, the thing is, it seems that all the manipulations with tracks, such as overlapping, assume that the features of a track have names and 2 other fields after the name. You can see it, for example, in the overlap.py file. Here is the make_feature function from git:
It caused an error on the line with the call to _makename, because there was no a[2] or b[2] in my track: the features didn't have names.
This is how my dirty function looks; it causes no trouble:
The change is quite obvious, and it solved the problem, but it is not really a good way to solve it.
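A more general fix might be to pad short features with blank fields before combining them, so that indexing a[2] or b[2] is always safe. The following is only a minimal sketch of that idea, not the package's actual code: it assumes features are tuples laid out as (start, end, name, score, strand), and the names pad_feature, make_name, overlap_feature and DEFAULTS are hypothetical.

```python
# Hypothetical sketch, NOT the package's real implementation.
# Assumed feature layout: (start, end, name, score, strand).
DEFAULTS = ("", 0.0, 0)  # assumed blank values for name, score, strand


def pad_feature(feature, defaults=DEFAULTS):
    """Return the feature as a tuple with the missing trailing fields filled in."""
    feature = tuple(feature)
    missing = 2 + len(defaults) - len(feature)
    if missing <= 0:
        return feature  # already has every expected field
    return feature + defaults[-missing:]


def make_name(a, b):
    """Join two feature names, skipping blank ones."""
    return " + ".join(name for name in (a, b) if name)


def overlap_feature(a, b):
    """Return the overlap of two features as (start, end, name), or None."""
    a, b = pad_feature(a), pad_feature(b)
    start, end = max(a[0], b[0]), min(a[1], b[1])
    if start >= end:
        return None  # the features do not overlap
    return (start, end, make_name(a[2], b[2]))
```

With this kind of padding, features that carry only start and end go through the same code path as fully annotated ones, and the combined name simply comes out blank.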
Another thing, concerning fields, is that something like this doesn't work:
It causes this:
I had to comment out a few lines in manipulate.py to make it work, though it probably now loses the field information from tracks (I don't have any fields except for start and end, so it is not a problem for now).
It would be really nice if you looked into these issues.