gardners / 2014SE3

Software Engineering 3, Semester 1 2014.

Research Paper (algh0065) #200


algh0065 commented 10 years ago
                                         NASA Mars Climate Orbiter

                                                  Saleh Alghamdi

                                                      algh0065

                   School of Computer Science, Engineering and Mathematics 
                                  Flinders University of South Australia 
                            PO Box 2100, Adelaide 5001, South Australia
                                            Algh0065@flinders.edu.au

1 Introduction

The Mars Climate Orbiter was a robotic space probe launched by NASA in 1998 to study the climate, atmosphere, and surface changes of Mars (MCO, 2000). The orbiter was also intended to serve as the communications relay for the Mars Polar Lander. NASA sent the probe to Mars to achieve several science objectives. First, it was expected to monitor weather and atmospheric conditions on Mars on a daily basis. It was also required to record and report any evidence of climate change. Other objectives included an analysis of the water distribution, temperature profiles, water vapour, and dust content on the Martian surface.

2 The NASA Mars Orbiter

2.1 A Catastrophic Software Failure:

The NASA Mars Orbiter was developed and launched at a time when NASA was operating under the “Faster, Better, Cheaper” philosophy, which was meant to encourage the organization to achieve more objectives with fewer resources (MCO, 2000). The organization aimed to increase productivity and innovation while keeping its approaches to mission success cost-effective and safe. The initiative led to the restructuring of many missions and programs (MCO, 2000). Under this approach, program costs were greatly reduced while the technology and content packed into each mission increased. This is the background against which the Mars Orbiter was launched, and the approach may have been a contributing factor to the failure.

This is because, as projects under this approach evolved, the focus on schedule and cost increased, leading to unacceptable levels of risk in NASA projects (MCO, 2000). Even today, NASA may be accepting very high risk in some initiatives by increasing scope while reducing cost (Stone, 2000).

2.2 The NASA Mars Orbiter Software Failure:

Reports indicate that the loss of the NASA Mars Orbiter was caused by a catastrophic failure in a piece of ground software. This software calculated and produced its results in imperial units, while the system receiving those results expected them in SI (metric) units (Hoekenga, 2012). The navigation error therefore arose from the software that handled the thruster data, which was not working in SI units: the navigation software expected thruster impulse values in newton-seconds, while the software that completed the calculations delivered them in pound-force seconds (Board, 1999). According to Douglas & Don (1999), this ground software may have been inadequately tested, which allowed the errors in the calculations to go unnoticed. Because the results were delivered in the wrong units, the calculated position of the orbiter diverged from its measured position, and this produced a large discrepancy between the expected and actual orbital insertion altitude. Some engineers noted the discrepancy early, but their concerns were dismissed; it later emerged that the orbiter was at the wrong orbital insertion altitude (Douglas & Don, 1999). This discrepancy led to the loss of the orbiter, which probably crashed in an unknown location. After the error was discovered, an urgent meeting of propulsion engineers, managers, software developers and trajectory operators was called to evaluate the possibility of conducting a trajectory manoeuvre. However, this manoeuvre was ultimately dismissed as the NASA Mars Orbiter was considered lost. NASA responded to this catastrophic failure and the loss of the orbiter in several ways. The next section considers the actions taken by NASA and the appropriateness of these measures.
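The unit mismatch described above can be illustrated with a short sketch. It is not NASA's code: the `Impulse` type, the `misread_as_si` helper, and the value `10.0` are hypothetical names and numbers chosen only for illustration. The one fixed fact is the conversion factor, since one pound-force is defined as exactly 4.4482216152605 newtons. The sketch shows how a number produced in pound-force seconds, if read as newton-seconds, understates the real impulse by a factor of about 4.45, and how attaching the unit to the value at the interface avoids the mistake.

```python
from dataclasses import dataclass

# Exact by definition: 1 lbf = 0.45359237 kg * 9.80665 m/s^2
LBF_TO_NEWTON = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """Hypothetical value type: impulse always stored in newton-seconds."""
    newton_seconds: float

    @classmethod
    def from_pound_force_seconds(cls, value: float) -> "Impulse":
        # Convert at the boundary so downstream code only ever sees SI units.
        return cls(value * LBF_TO_NEWTON)

    @classmethod
    def from_newton_seconds(cls, value: float) -> "Impulse":
        return cls(value)


def misread_as_si(value_lbf_s: float) -> float:
    """Reproduce the mistake: treat a lbf*s number as if it were N*s."""
    return value_lbf_s


# Illustrative number only, not a real mission value.
reported_lbf_s = 10.0

correct = Impulse.from_pound_force_seconds(reported_lbf_s).newton_seconds
wrong = misread_as_si(reported_lbf_s)

# How much the receiving software would underestimate each thruster firing.
underestimate_factor = correct / wrong
```

Passing a bare number between two programs leaves the unit as an unstated assumption on each side. Forcing the caller to pick `from_pound_force_seconds` or `from_newton_seconds` makes that assumption explicit at the point where the data crosses the interface.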

2.3 Responses to the Failure of the Software and their Appropriateness:

Immediately after the loss of the orbiter, NASA constituted the Mars Climate Orbiter Mishap Investigation Board to investigate the matter and forward its recommendations to NASA (MCO, 2000). The board's report cited several contributory factors. It noted that errors were occurring but went undetected within the ground-based computer models of how thruster firings were predicted and executed. These models missed the errors in the thruster firings throughout the orbiter's interplanetary journey to Mars. MCO (2000) noted that the software error that led to the loss of the orbiter was not the first error produced by the software; the report suggests that other small errors during the interplanetary trip also went unnoticed by the ground computer models.

Another contributory factor noted by the board was that the navigation team in the NASA operations centre was not as fully informed about how the orbiter was pointed in space as navigation teams had been in other programs. Further, an optional thruster firing to raise the orbiter's path relative to Mars was considered but not performed, which was also a contributory factor. The board also noted that the Systems Engineering department of NASA was required to double-check the operation of all interconnected parts, but this was not done well (MCO, 2000). In addition, engineering groups relied on informal communication channels, and some personnel were not sufficiently trained in the navigational aspects of the operation. Ultimately, the verification and validation of systems engineering requirements and technical aspects of the project was inadequate.

When the software error was discovered, NASA called an immediate meeting of propulsion engineers, managers, software developers and trajectory operators to evaluate the possibility of conducting TCM-5 (Trajectory Correction Maneuver-5). However, after consultations with NASA top management, this manoeuvre was never performed. Following the board's investigation and recommendations, NASA took action to ensure that such a failure would not occur again. It has made efforts to ensure that the systems engineering team is fully staffed at the beginning of each project. Further, the organization makes sure that the engineering team has the skills required to work with the subsystem engineers and to set up an efficient communication flow. NASA also engages operations personnel at the early stages of a project so that they gain detailed knowledge of it. A set of mission requirements is developed early in the project formulation phase, and a thorough “flow-down” of the requirements of every system is carried out to the subsystem level (MCO, 2000). According to MCO (2000), NASA also conducts system analyses to identify mission risks in all segments and subsystems of the project. The project teams must also work thoroughly to make “trade-off decisions” that treat and mitigate risks in order to increase the probability of mission success (MCO, 2000). A recommendation has also been made and approved to deploy alternative navigational schemes in space missions, with “relative navigation” used when the spacecraft is in the vicinity of other planets (MCO, 2000). Technology development for optical tracking and GPS-based autonomous orbit determination is also being pursued.

The immediate action taken by NASA, convening an emergency meeting of propulsion engineers, managers, software developers and trajectory operators, was appropriate and could have had a positive impact on the project. However, the decision not to carry out TCM-5 (Trajectory Correction Maneuver-5) meant that this immediate action achieved nothing for the mission. I think the team should have completed the TCM-5 manoeuvre, since it would have had no further consequences for the costs of the mission.

The long-term actions described above will have far-reaching implications for NASA. Following the recommendations of the investigation board, NASA's actions to establish a comprehensive engineering team and to engage and train personnel will increase costs. However, these actions are appropriate, since they can prevent mission failures that lead to even greater financial losses. For instance, the loss of the orbiter cost $125 million. The recommended actions will lead to increased costs, but not on the scale of the cost of a mission failure. NASA's actions to perform continual system analysis and risk identification, and to develop tracking technology and alternative navigational schemes, are also appropriate, although they too will increase costs. Again, these costs do not exceed the cost of mission failure, which means that the actions are justifiable.

3 Conclusion

The loss of the NASA Mars Climate Orbiter cost at least $125 million and was due to a failure in software. The ground-based software calculated its output in imperial units, while the system receiving that output expected SI (metric) units. The measures taken by NASA in accordance with the report of the investigation board included the establishment of a comprehensive engineering team and the engagement and training of personnel. Other measures included performing continual system analysis and risk identification, using tracking technology, and developing alternative navigational schemes. Even though these measures will increase operational costs for NASA, they are appropriate, because the additional costs are small compared with the cost of mission failure.

4 Reference List

Board, M. C. (1999). Mars Climate Orbiter Mishap Investigation Board Phase I Report. Retrieved June 5, 2014, from ftp://ftp.hq.nasa.gov/pub/pao/reports/1999/MCO_report.pdf

Douglas, I., & Don, S. (1999, November 10). MARS CLIMATE ORBITER FAILURE BOARD RELEASES REPORT, NUMEROUS NASA ACTIONS UNDERWAY IN RESPONSE. Retrieved June 5, 2014, from http://mars.jpl.nasa.gov/msp98/news/mco991110.html

Hoekenga, C. (2012, September 12). Tragedies in Science: The Crash of the Mars Climate Orbiter. Retrieved June 5, 2014, from http://www.visionlearning.com/blog/2012/09/21/tragedies-in-science-the-crash-of-the-mars-climate-orbiter/

Hotz, R. L. (1999, October 1). Mars Probe Lost Due to Simple Math Error. Retrieved June 5, 2014, from http://articles.latimes.com/1999/oct/01/news/mn-17288

MCO, M. C. (2000, March 13). Report on Project Management in NASA by the Mars Climate Orbiter Mishap Investigation Board. Retrieved June 5, 2014, from ftp://ftp.hq.nasa.gov/pub/pao/reports/2000/MCO_MIB_Report.pdf

Stone, E. (2000). MARS PRESS CONFERENCE. Retrieved June 6, 2014, from ftp://ftp.hq.nasa.gov/pub/pao/reports/1999/MCO_charts.pdf

algh0065 commented 10 years ago

Hi Paul

Sorry for the inconvenience. When I submitted my research paper after creating a new issue, it was displayed like this. I do not know what the problem is. I created an issue in my own repository and the same problem happened.

Best Regards

gardners commented 10 years ago

Already marked.